ABSTRACT


This  thesis  systematically  and  comprehensively  analyzes 
available  personnel  data  to  determine  if  a  significant 
relationship  exists  between  measures  of  intelligence  and 
academic  performance,  and  career  promotion  rate  for 
Noncommissioned Officers. Forty thousand Noncommissioned
Officer  (NCO)  records  were  analyzed  to  determine  this,  using 
three  approaches. 

The  first  approach  was  a  sequential  procedure  which 
progressed  from  analysis  of  individual  variables  through 
multivariate  regression  models.  The  second  approach  focused 
on  analysis  of  NCO's  who  scored  in  the  top  three  percent  of 
promotion  rate.  The  third  approach  used  more  advanced 
statistical  techniques,  including  the  use  of  principal 
components  and  factor  analysis,  to  better  identify  the  most 
influential  explanatory  variables. 

During  the  analysis,  eight  measures  of  intelligence  and 
academic  ability  were  used  as  explanatory  variables.  Four 
control  variables  were  included  in  the  analysis  to 
discriminate  between  subcategories  of  NCO's.  They  were: 
sex,  career  field,  race,  and  paygrade. 

Throughout  the  analysis  consideration  of  Army  promotion 
and  accession  policy  was  included.  Knowledge  of  these 
policies  resulted  in  elimination  of  some  special  groups  which 
had  received  promotions  under  significantly  different 
conditions  than  the  rest  of  the  sample.  An  example  of  this 
was  Reserve  and  National  Guard  members  called  to  active  duty. 

This  study  found  that  there  was  significant  statistical 
evidence  to  show  that  a  high  level  of  Armed  Forces 
Qualification  Test  (AFQT)  score  and  prior  service  academic 
accomplishment  will  correspond  to  a  higher  promotion  rate. 
Also,  in-service  measures  of  NCO  education  and  performance 
testing  were  good  indicators  of  promotion  rate. 

However,  there  was  significant  variance  associated  with 
the  explanatory  relationship.  As  a  result,  a  useful 
predictive  model  could  not  be  designed  using  regression 
methods.  Although  the  model  could  predict  promotion  averages 
for  major  population  subcategories,  it  was  unreliable  when 
used  solely  with  the  AFQT  variable. 

The  findings  of  this  study  suggest  two  policy 
recommendations.  The  first  recommendation  was  a  confirmation 
of  the  constraints  placed  on  AFQT  category  and  high  school 
diploma  status  by  the  1984  Defense  Authorizations  Act.  The 
second  recommendation  was  to  require  promotion  boards  to 
consider  NCO  schooling  level  and  performance  test  scores  in 
their proceedings, but to avoid directly tying either score to
promotion,  in  terms  of  a  minimum  quota  or  scaled  promotion 
point  scale. 

Finally,  a  suggestion  was  given  for  further  research  to 
investigate  the  underlying  reasons  for  different  attrition 
patterns  observed  among  racial  and  ethnic  groups. 
I. INTRODUCTION


A.  BACKGROUND 

In almost any organization, one hopes that individuals at
high  levels  of  authority  are  gifted  with  higher  than  average 
intelligence.  Correspondingly,  one  would  think  that,  given 
equal  work  effort,  a  more  intelligent  person  will  advance 
more  rapidly  than  his  contemporaries  in  an  organization. 

It  is  not  difficult,  however,  to  find  examples  which 
contradict  our  perceptions  of  the  role  of  intelligence  in 
career  advancement.  In  almost  any  field  one  can  remember  an 
individual  who  was  not  the  most  intellectually  gifted,  but 
through  hard  work  and  persistence,  or  other  less  quantifiable 
traits,  advanced  equally  or  better  than  persons  of  higher 
measured  mental  ability.  There  is  ample  room  for  other 
influences  to  overwhelm  the  value  of  a  person's  intelligence 
in  the  eyes  of  a  superior.  An  unattractive  personality,  an 
inability  to  apply  that  intelligence  to  the  tasks  at  hand, 
and  a  myriad  of  other  flaws  can  discredit  the  merit  of  raw 
intelligence.

The degree to which intelligence affects advancement
lies  in  the  area  of  complex  interaction  between  individuals 
and  organizations.  It  carries  with  it  much  of  the 
uncertainty  of  quantification  of  human  performance. 

Despite  ample  room  for  exceptions,  the  concept  of  a 
general reward for being more intelligent still seems
reasonable. It may be, however, that to clearly see its
manifestation requires looking at a large number of people
who  have  been  affected  by  as  similar  a  set  of  opportunities 
for  advancement  as  possible.  It  is  the  task  of  this  thesis 
to  investigate  this  relationship  within  a  fairly  restricted, 
but  numerically  large  population.  The  population  is  one 
which  has  had  fundamental  raw  statistics  uniformly  obtained, 
and  where  policies  to  promote  personnel  are  unambiguous  and 
well  documented. 

B.  PURPOSE 

The  purpose  of  this  thesis  is  to  answer  a  central 
question:  Does  a  significant  relationship  exist  between 
measures  of  intelligence  and  academic  ability,  and  an 
individual's  promotion  rate  as  a  Noncommissioned  Officer? 
Put  more  simply,  does  being  smarter,  as  measured  by  initial 
test  scores,  or  being  better  schooled,  indicate  that  a  person 
will  perform  better  and,  hence,  advance  more  quickly  than  his 
peers? 

The  answer  to  this  question  has  important  implications 
for  Army  policies  of  recruitment,  retention,  and  promotion. 
It  is  also  a  matter  of  general  interest  to  social  scientists. 

C.  ORGANIZATION 

This  thesis  is  organized  fundamentally  as  a  data  analysis 
investigation.  Chapters  I  and  II  provide  preliminary 
information  on  the  nature  of  the  study  variables,  and  briefly 
review some related articles which have addressed this topic.
The remaining chapters discuss the analysis of approximately
forty thousand Noncommissioned Officer (NCO) records using
three  related  approaches.  The  first  approach  is  a  fairly 
standard  procedure  of  experimental  data  analysis.  This 
procedure  begins  with  analysis  of  fundamental  attributes  of 
individual  variables,  then  advances  through  successive 
increases  in  dimensionality  and  complexity.  The  second 
approach  views  a  subset  of  the  population  which  distinguishes 
itself  by  being  in  the  top  three  percent  of  the  NCO  promotion 
rates.  Comparison  of  these  top  performers  to  the  remainder 
of  the  population  identifies  attributes  which  are  found  to  be 
significantly  different,  and  hence,  are  possibly  an 
associated  cause  for  rapid  advancement.  In  the  third 
approach,  the  statistical  methods  of  principal  components  and 
factor  analysis  are  used  to  provide  an  alternative  method  of 
critical variable selection, as well as to lend credibility
to  the  results  of  the  other  two  approaches. 
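
The second approach, comparing the top three percent of promotion rates against the remainder of the population, can be sketched in Python as follows. This is only a minimal illustration on synthetic records, not the thesis data or the author's code. The field names `afqt` and `promotion_rate`, the helper names, and the simulated relationship between the two fields are all assumptions made for the example.

```python
import random
from statistics import mean

def split_top_fraction(records, key, fraction=0.03):
    """Split records into (top, rest) by descending value of `key`."""
    ranked = sorted(records, key=lambda r: r[key], reverse=True)
    n_top = max(1, round(fraction * len(ranked)))
    return ranked[:n_top], ranked[n_top:]

def compare_means(top, rest, attribute):
    """Return (mean in top group, mean in remainder) for one attribute."""
    return mean(r[attribute] for r in top), mean(r[attribute] for r in rest)

# Synthetic, purely illustrative records (NOT thesis data). The assumed
# link between AFQT and promotion rate is invented for the example.
random.seed(0)
records = []
for _ in range(1000):
    afqt = random.uniform(10, 99)
    rate = 0.5 + 0.005 * afqt + random.gauss(0, 0.2)
    records.append({"afqt": afqt, "promotion_rate": rate})

top, rest = split_top_fraction(records, "promotion_rate")
top_mean, rest_mean = compare_means(top, rest, "afqt")
print(len(top), round(top_mean, 1), round(rest_mean, 1))
```

A difference in group means found this way only flags an attribute for closer study; judging whether the difference is statistically significant would need a formal test, which this sketch omits.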

D.  PRELIMINARY  INFORMATION 

This  section  contains  an  initial  discussion  about  the 
nature  of  the  data,  a  general  overview  of  the  Army  NCO 
promotion  system,  and  a  synopsis  of  the  analytical  tools  used 
in this thesis. As previously mentioned, there is a degree
of looseness in the effectiveness of measurement for
intelligence and academic data, and also some confounding
phenomena in Army promotion policy. Early recognition of
these problems should set the degree of caution which is
needed  in  reviewing  the  subsequent  chapters  of  analysis.  The 
section  on  analytical  tools  is  intended  to  inform  the  reader 
of  the  conditions  under  which  the  data  analysis  was 
conducted,  and  the  hardware  and  software  used. 

1. Intelligence Test Scores
a.  General 

The  data  for  intelligence  test  scores  falls  into 
the  category  sometimes  referred  to  as  Defined  Measurement.  A 
Defined  Measurement  is  one  where  the  property  being 
considered cannot be measured directly. [Ref. 1:p. 6] As a
result, a related measure is substituted for measurement of
the actual property. In this case, the property is
intelligence, and the presumed related measurements are test
scores from a particular battery of tests.

The  efficacy  of  intelligence  tests  as  a  representative 
measure  for  intellectual  ability  is  itself  an  issue 
surrounded  by  controversy.  This  controversy  has  been  the 
topic  of  entire  books  and  studies.  The  testing  done  by  the 
Army is the Armed Services Vocational Aptitude Battery, or
ASVAB. Although not designed specifically as an intelligence
test, the ASVAB does predict general trainability.
Additional research has shown that the mathematical and
verbal portions of the ASVAB have a high correlation to the
ACT, PSAT, and SAT college entrance examinations. [Ref. 2]
The ASVAB has been studied, improved, and used for over forty
years. A recent article by Jenson [Ref. 3:p. 35], in
Measurement and Evaluation in Counseling and Development,
states:

"To  the  degree  that  success  in  various  occupations  and 
training  programs  requires  different  levels  of  general 
ability  (often  called  intelligence  or  IQ),  an  ASVAB 
composite  (it  hardly  matters  which  one)  will  be  as 
validly  predictive  as  any  test  now  on  the  market.  .  .  It 
seems  that  the  new  ASVAB-14  is  near  the  limit  of 
refinement, psychometrically."

Generally  then,  the  ASVAB  is  a  well  documented  and 
established  aptitude  test.  Although  the  military  does  not 
specifically attempt to determine the intelligence of its
potential  candidates,  academic  portions  of  the  ASVAB  test 
have  shown  themselves  to  be  reasonably  defined  measurements 
of  intelligence. 

b.  Specific  Tests. 

The  ASVAB  consists  of  a  battery  of  ten  subtests. 
Composites  of  the  subtests  of  the  ASVAB  are  used  to  determine 
the  overall  acceptability  of  an  individual  requesting 
enlistment,  and  for  which  field  he  or  she  would  best  be 
suited. From the entire battery of tests, two derived scores
are taken as aggregate measures of intelligence. The first is
the GT, or general intelligence score. This score is the
aggregation of three subtests: word knowledge, paragraph
comprehension, and arithmetic reasoning. The second derived
measure of intelligence is the Armed Forces Qualification
Test score, or AFQT. This score is derived from word
knowledge, paragraph comprehension, arithmetic reasoning, and
numerical operations. [Ref. 10:sec 1-0, p. 1] An AFQT score is
reported  as  a  percentile  score  representing  the  examinee's 
relative  standing  in  reference  to  a  specific  population. 

There  has  recently  been  some  additional  manipulation  of 
the  AFQT  score.  In  October  of  1984,  the  reference  population 
for  assignment  of  an  individual's  AFQT  percentile  was  shifted 
from  a  base  reference  population  of  1944  to  that  of  1980.  A 
base  reference  population  is  a  set  of  values  designed  to 
represent  how  the  raw  AFQT  scores  of  the  entire  American 
youth  population  would  be  distributed.  This  set  of  values 
was  originally  designed  in  1944,  and  had  not  been  updated 
until  1980.  This  thesis  utilized  the  1980  base  AFQT 
percentiles.  A  transformation  of  test  percentiles  for 
soldiers  who  enlisted  prior  to  1980  was  effected  by  the 
Defense  Manpower  Data  Center  (DMDC),  and  all  subsequent 
Department  of  the  Army  records  have  been  computed  based  on 
the  1980  reference.  A  listing  for  AFQT  percentile 
transformations  can  be  found  in  APPENDIX  A. 
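
The percentile scoring described above, and the effect of changing the reference population, can be illustrated with a short Python sketch. The reference scores below are invented for the example and are not the 1944 or 1980 norms. The "at or below" definition of a percentile and the function name `percentile_score` are also assumptions. The actual percentile transformations are those listed in Appendix A.

```python
from bisect import bisect_right

def percentile_score(raw_score, reference_scores):
    """Percent of the reference population scoring at or below raw_score."""
    ordered = sorted(reference_scores)
    return 100.0 * bisect_right(ordered, raw_score) / len(ordered)

# Hypothetical raw scores for two reference populations (invented values,
# NOT the actual 1944 or 1980 norms).
older_reference = [30, 34, 39, 43, 47, 50, 53, 57, 61, 66]
newer_reference = [35, 42, 48, 51, 55, 58, 60, 63, 67, 72]

raw = 58
old_percentile = percentile_score(raw, older_reference)
new_percentile = percentile_score(raw, newer_reference)
print(old_percentile, new_percentile)
```

The same raw score maps to a different percentile under each reference population, which is why records scored against the earlier base had to be transformed before they could be compared with records scored against the 1980 base.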

GT  scores,  which  are  expressed  as  the  sum  of  the  raw 
test scores, have not been manipulated. However, unlike the
case with the AFQT score, soldiers have been allowed to
retake their tests to increase their original GT scores.
Retesting was introduced in 1982 when a minimum GT score of
120 was enforced on eligibility for promotion to NCO rank.


2. Academic Scores
a. General

The  data  used  for  academic  ability  is  also  a 
defined  measurement,  similar  to  the  measures  for 
intelligence.  Specifically,  the  property  of  academic  ability 
is being represented by a simple assignment of the number of
years of education completed. This value is independent of
the quality of education, and of the grades that any given
individual may have received. This study assumes that
continued attendance and progression through the educational
system is inherently indicative of academic ability. For
example, a high school graduate is assumed to have more
academic ability than an individual with an eighth grade
education. The informational value of academic scores is thus
not as great as desired. Each score is treated in analysis
as only an ordinal-scaled variable.

b.  Specific 

Three  academic  scores  are  used  in  the  study: 
present education level, education level upon entry into
the Army, and military education since entry. Because advanced
professional  schooling  is  made  available  only  to  those 
individuals  who  have  superior  service  records,  the  military 
education  score  carries  with  it  some  additional  information 
relative  to  the  performance  of  the  NCO. 

3. Promotion Scores

Promotion  within  the  Army  is  a  closely  supervised  and 
somewhat  complicated  procedure.  It  is  the  product  of  a 
considerable number of policies which are not uniformly
applied across the population. Instead, they are applied
within rank structure, within career field, or even as a
function of years of education. Thus, although the
computation of an individual's promotion rate is an easy
task, that value may have been influenced by several policies
that were peculiar to the individual.


a. General

Promotion of NCO's is governed by Army Regulation
(AR) 600-200. This regulation establishes requirements for
eligibility, and outlines the process of selection. The
system views the individual's performance as a whole. This
includes a composite score based on performance scores,
commander's ratings, service awards, and review by a board of
senior NCO's. This composite point value is used as a
threshold value for the Department of the Army to use when
promoting individuals to the next higher paygrade, as slots
become available. The slots are accounted for by career
management field, and as such, the minimum threshold for a
combat soldier to be promoted may be different from that of a
support soldier. A general observation is that career fields
with more technical orientation have higher promotion point
thresholds, and consequently, longer times to advancement
than those in the larger and less technically oriented career
fields.


AR 600-200 also sets minimum times of service and grade
which an individual must have served to be considered for
promotion. Unless superseded by a special policy, the
shortest period for promotion to E-5 is two years, and to
E-6 is four years. These minimums include waivers for both
time in service and time in grade. Promotion to E-6 in four
years requires that the individual be advanced to E-5 in two
years.

b. Specific

Because  of  the  lack  of  uniformity  of  promotion 
within the Army population, in this thesis we have taken
considerable  care  to  identify  and  address  discontinuities 
which  would  confound  promotion  based  on  merit.  This  includes 
the  elimination  of  some  data,  and  the  computation  of  three 
different  promotion  rate  scores.  The  governing  principle  for 
manipulation  or  restriction  of  data  was  to  produce  a  sample 
population  in  which  each  individual  started  from  the  same 
point  in  the  rank  structure,  and  had  equal  opportunity  for 
advancement  by  merit.  Chapter  III,  Overview  of  the  Data, 
discusses  in  detail  the  identified  problems  and  what 
corrective  action  was  taken. 

4. Analytical Tools Used

This section briefly identifies the hardware and
software used in analysis.

a. Hardware

Computational  resources  used  for  analysis 
included an IBM 3033 System/370 mainframe computer running the
MVS batch system. Additionally, analysis was done for small
data sets using a standard IBM microcomputer.

b. Software

Two  software  packages  were  used  for  the  majority 
of  the  data  analysis.  SAS  Version  5  was  used  predominantly 
for  analysis  resulting  in  tabular  output,  such  as  principal 
components and factor analysis. [Ref. 4,5] Grafstat, an
unreleased IBM mainframe data analysis and plotting program,
was utilized for analysis requiring graphical output and for
confirmation of SAS tabular results. [Ref. 6,7]

E.  SUMMARY 

The  objective  of  this  introduction  has  been  to  adequately 
frame  the  scope  of  the  topic,  and  to  present  sufficient 
background  to  the  reader  so  that  he  or  she  is  alerted  to  some 
of  the  difficulties  inherent  in  a  topic  of  this  nature. 
Also,  this  will  establish  a  reference  for  some  of  the  tools 
used  to  conduct  the  analysis. 

The  length  of  this  section  is  indicative  of  the  degree  of 
preparation  required  to  analyze  a  relationship  which  has 
significant  complications  in  both  dependent  and  independent 
variables.  Although  the  list  of  assumptions  and  the 
stripping  of  aberrant  data  makes  one  cautious  about  the 
reality  of  such  a  study,  each  event  should  be  considered  on 
its  ability  to  uncover  the  answer  to  the  central  question  of 
this thesis. The central question, again, is whether a
significant relationship exists between measures of
intelligence and academic ability, and an individual's
promotion rate as a Noncommissioned Officer. It is important
to learn whether measures of intelligence and academic
ability are important indicators of promotion in the Army,
and  if  so,  how  strong  that  relationship  is.  If  sufficiently 
reliable  and  believable  relationships  can  be  determined,  then 
policies  could  be  designed  to  better  identify  and  develop 
capable  individuals  for  positions  of  leadership. 

The  analysis  of  this  thesis  reduced  the  effects  of 
confounding  policies,  such  as  discriminatory  promotion  and 
accession  programs.  It  also  used  a  sufficiently  large  sample 
size,  which  allowed  the  averages  to  outweigh  the  exceptions. 
It  drew  on  data  from  standard  personnel  records,  and  made  the 


II. A REVIEW OF PREVIOUS STUDIES


The topic of relating intelligence to some aspect of
performance is an extensive and rich area of study. It is a
topic of particular interest to social scientists and
military manpower specialists. As a demonstration of the
quantity of work done in this area, a simple cross-referencing
of the terms "intelligence test" and "performance"
produced a list of 237 citations from Lockheed's DIALOG
online information files. Restriction of available
references to those utilizing military intelligence test
scores and statistical analysis of those tests relative to
some performance measure still results in a large number of
citations. Within this restriction there is a variety of
study methodologies. A study can originate
from an in-house military analysis, a contracted study done
by a commercial analytical institute, or an academic
institution making use of military data as its medium for
analysis.


The nature of the data is also varied. Several studies
readministered the ASVAB tests to a selected test population.
Other studies used IQ and other intelligence measures in
addition to the ASVAB. The performance side of the
relationship had an extensive number of dependent variables.
Examples of performance measures were: results of written
exams, military skills test results, minority advancement,
and comparison to collegiate ACT, PSAT, and SAT tests.

This  chapter  will  review  four  of  the  most  closely 
related  studies,  concentrating  for  each  one  on: 

1.  The  objective  of  the  study. 

2.  The  methodology  used  in  analysis. 

3.  The  conclusion  reached. 

The first analysis is from Are Smart Tankers Better?
AFQT and Military Productivity. [Ref. 8] This study is
essentially an in-house military analysis, the authors being
Army  officers  assigned  to  the  Office  of  Economic  and  Manpower 
Analysis,  at  West  Point,  New  York.  As  described  in  the 
title,  the  paper  presents  the  results  of  an  investigation  in 
which  the  crews  of  tanks  were  scored  on  their  ability  to 
destroy  targets  on  live  fire  ranges.  The  AFQT  score  of  the 
gunner  and  tank  commander  was  one  of  several  explanatory 
variables,  having  the  tank  scores  as  the  dependent  variable. 
The  analysis  methodology  used  a  log-log  production  model  with 
ordinary  least  squares  regression. 

The  result  of  their  analysis  is  best  summarized  in  this 
paragraph  from  the  study: 

"That  there  exists  a  positive,  statistically 
significant  relationship  between  AFQT  and  performance,  is 
a  powerful  result.  The  coefficients  on  the  model  means 
that  if  we  move,  for  example,  from  the  AFQT  score  for  an 
average  Category  IV  TC  to  the  AFQT  score  for  an  average 
Category IIIA TC, (a 200% increase), we will increase the
performance  on  Table  8  (the  tank  scoring  exercise)  by 
approximately  20.3%." 


In  this  study  then,  AFQT  was  found,  by  means  of  least  squares 
regression,  to  have  a  definitive  relationship  to  a  well- 
defined  skill  measure,  the  conduct  of  tank  firing. 
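
How a log-log coefficient turns into the percent changes quoted above can be sketched in Python. The cited passage does not report the fitted coefficient here, so the sketch backs out the elasticity implied by the quoted figures (a 200% increase in AFQT, a 20.3% increase in performance) under two readings: an exact log-log relationship, and a linear "percent per percent" approximation. Both readings and the function names are assumptions for illustration, not values taken from the cited study.

```python
import math

def implied_elasticity(pct_change_x, pct_change_y):
    """Exact log-log elasticity b solving (1 + dx)^b = 1 + dy."""
    return math.log(1 + pct_change_y / 100) / math.log(1 + pct_change_x / 100)

def predicted_pct_change(elasticity, pct_change_x):
    """Percent change in y implied by y = A * x^b for a percent change in x."""
    return 100 * ((1 + pct_change_x / 100) ** elasticity - 1)

# Figures quoted in the passage above; the coefficient itself is not given.
b_exact = implied_elasticity(200, 20.3)   # exact log-log reading
b_linear = 20.3 / 200                     # linear-approximation reading

print(round(b_exact, 4), round(b_linear, 4))
print(round(predicted_pct_change(b_exact, 200), 1))
```

The two readings give noticeably different coefficients because the linear approximation is only accurate for small percent changes, and a 200% increase is not small.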

The  second  study  is  an  analysis  done  at  the  University  of 
Iowa by the Cada Research Group titled On Predicting
Success in Training for Males and Females: Marine Corps
Clerical Specialties and ASVAB Forms 6 and 7. [Ref. 9] This
report uses the ASVAB score as an explanatory variable for
success of recruits in training. The methodology used is
primarily regression; however, the scope of the regression
concentrates on identifying differences between male and
female performance. The implicit result in the study's
discussion of the sex score differences is that the
regressions performed for each category were of useful
predictive value. An interesting note about this study was
that the inclusion of high school completion reduced the
difference between the male and female regression
coefficients.

The  third  study  is  a  section  of  articles  used  in  the 
Report to the House and Senate Committees on Armed Services,
Defense Manpower Quality, Volume II, Army Submission.
[Ref. 10] The section of interest to this thesis was a study
done by the U. S. Army Training and Doctrine Command (TRADOC)
Systems Analysis Activity (TRASANA). The study uses AFQT, as
well as education level, sex, paygrade, time in service, time
in Military Occupational Specialty (MOS), and a dummy
variable reflecting General Equivalency Diploma (GED)
completion  as  explanatory  variables.  GED  is  a  rating  given 
to  individuals  who  did  not  graduate  from  high  school,  but  who 
have  taken  examinations  to  be  rated  as  equivalent  to  a  high 
school  graduate.  A  battery  of  tests  given  under  controlled 
conditions  resulted  in  a  net  score  which  was  made  the 
dependent  variable.  The  battery  of  tests  was  designed  so  as 
to  represent  how  proficient  a  soldier  was  in  his  specific 
career field. The test included a written as well as a
hands-on proficiency test.

The analysis method used was linear regression, with the
inclusion of a Durbin Instrument as a correction tool for
AFQT. The results are again best summarized from the report:


"The  most  important  result  is  that  AFQT  Category  I-IIIA 
soldiers  performed  approximately  10%  better  overall  than 
IIIB soldiers. . . Furthermore, AFQT was a much more
important  influence  on  performance  in  virtually  all 
instances  than  either  education  or  experience,  whether 
measured  in  terms  of  time  in  service,  MOS,  or  unit. 
Thus,  these  results  strongly  support  the  validity  of  AFQT 
as  a  predictor  of  performance  in  these  military 
occupational  specialties." 

This  report  then,  is  very  similar  in  conclusion  to  the 
tank  gunnery  report,  in  which  AFQT  was  shown  through 
regression  to  have  a  significant  and  measurable  effect  on 
soldier  performance  in  skill  related  tasks. 

The  last  study  reviewed  is  also  from  the  collection  found 
in  the  Defense  Manpower  Study.  [Ref.  11]  The  topic  for  this 
study  was  the  estimation  of  promotion  rate.  It  is  presently 
the most similar study to the central theme of this thesis.
Using AFQT as one of the independent variables, a duration
model  is  applied  to  estimate  the  expected  speed  of  promotion. 
This  model  was  applied  within  two  categories,  the  paygrade 
and the career field of the NCOs. This promotion estimation
study  approaches  the  aggregation  of  data  in  a  different 
manner  as  well.  Specifically,  by  evaluating  the  possibility 
of  promotion  for  each  individual  over  a  series  of  years,  the 
dimension  of  time  was  entered  into  analysis.  A  significant 
advantage  of  including  the  time  dimension  was  that  changes  in 
the  categorical  levels  of  the  population  could  be  accounted 
for,  such  as  race  or  sex. 

The methodology used in the promotion estimation study is
considerably more complex than in the previous studies.
Rather than using standard regression models, the study uses
the Generalized Linear Model form. Specifically, the form of
the predictive model is a log likelihood function using the
Weibull shape parameter. The explanatory variables include
education, AFQT, marital status, race, number of dependents,
time in service, sex, and high school completion status. By
using the Weibull model, the application of explanatory
variables which are not continuous, such as sex, high school
completion status, and marital status, is more proper.
Additionally, there are no requirements for the normality
assumptions for the residuals, and therefore, less
subjectivity in judging the appropriateness of the model with
respect to the independent variables. This method, however,
does not consider any in-service information and was
calculated only for very specific CMF and Paygrade
combinations. The results
are  summarized  as  follows: 

"A  review  of  these  promotion  results  reveals  two 
trends.  First,  even  after  controlling  for  high  school 
diploma  status,  AFQT  Category  I-IIIA  soldiers  are 
promoted approximately 10% more rapidly than IIIB
soldiers.  Second,  high  school  completion  is  less 
important  than  AFQT  score  in  determining  promotion  rates. 
The  remarkable  aspect  of  this  last  result  is  that 
educational attainment is an explicit part of the Army's
promotion  point  system,  while  AFQT  scores  are  not.  These 
trends  are  true  for  both  promotion  to  E-5  and  promotion 
to  E-6." 
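
The kind of Weibull duration model described above can be sketched as a log likelihood in Python. This is a generic textbook form and not necessarily the exact specification used in the cited study. The parameterization (scale equal to the exponential of a linear combination of covariates), the treatment of soldiers not yet promoted as right-censored observations, and all names in the code are assumptions made for illustration.

```python
import math

def weibull_log_likelihood(times, promoted, covariates, beta, shape):
    """Log likelihood of a Weibull duration model with right censoring.

    times      : time to promotion, or time observed so far if not promoted
    promoted   : True if promotion was observed, False if censored
    covariates : rows of covariate values (lead with 1.0 for an intercept)
    beta       : coefficients; the scale for row x is exp(x . beta)
    shape      : Weibull shape parameter (must be positive)
    """
    total = 0.0
    for t, event, x in zip(times, promoted, covariates):
        scale = math.exp(sum(b * xi for b, xi in zip(beta, x)))
        total -= (t / scale) ** shape  # log survival term, every record
        if event:  # observed promotions also contribute the log hazard
            total += math.log(shape / scale) + (shape - 1) * math.log(t / scale)
    return total

# Two hypothetical records: one promoted at 2 years, one still waiting at 3.
# The second covariate is a 0/1 indicator, e.g. high school completion.
ll = weibull_log_likelihood(
    times=[2.0, 3.0],
    promoted=[True, False],
    covariates=[[1.0, 0.0], [1.0, 1.0]],
    beta=[math.log(2.0), 0.0],
    shape=1.0,
)
print(ll)
```

Because covariates enter only through the scale term, 0/1 indicators such as sex or diploma status need no special handling. In practice the coefficients and shape would be chosen by a numerical optimizer to maximize this function, a step omitted here.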

As  considerable  attention  has  already  been  given  to  the 
topic  of  relating  measures  of  intelligence  to  performance, 
and  since  positive  results  have  generally  been  the  result, 
one  might  wonder  why  another  study  should  be  undertaken. 
First, this thesis is in response to a request by the Office
of the Deputy Chief of Staff for Personnel (ODCSPER) for
further research in the relationship of AFQT to success in
the Army. Secondly, this thesis will be different in its
approach and analytical procedures. Following is a list of
the  unique  characteristics  of  this  thesis: 

1.  The perspective of this thesis is that the results will
be used as a management tool, or as an explanatory
method for active duty Army personnel. In that light,
the study utilizes information collected from the
individual's in-service record, such as his Skill
Qualification Scores and his NCO Schooling levels.
Similar to accession related studies, this analysis
includes intelligence, academic, and categorical
information as potential explanatory variables.
However, the intent is not to justify accession of high
quality soldiers, but to investigate the trends of
promotion for active duty personnel as a function of
available personnel data.


2.  This study conducts significant investigation
into the data to identify and correct anomalies which
would confound the relationship in question.

3.  Statistical analysis is done from the bottom up,
rather than by direct movement into regression models.
This approach finds that strict parametric models are
subject to error due to the inability of some data
variables to meet distributional assumptions necessary
for parametric analysis. The study then moves to
nonparametric means to approach the issue.

4.  For regression models, given the cautions on their use,
an additional sample population is tested using the
model. Thus, the results from the initial model can be
considered to have more believability and fidelity than
a model based on analysis of a single population
sample.

5.  The  use  of  a  large  data  set.* 

6.  Several  explanatory  variables  have  been  made 
available  from  the  DMDC  data  base  which  have  not  been 
used  in  previous  studies.  They  include  the  initial 
education  at  time  of  entry,  NCO  education  level,  and  a 
race  variable  with  six  categories. 

7.  The  choice  of  promotion  as  the  dependent  variable 
rather  than  a  set  of  performance  tests.  Although  prone 
to  more  uncertainty  than  results  of  performance  tests, 
promotion  is  in  many  ways  an  ultimate  performance 
measure.  The  service,  like  any  other  organization, 
recognizes  superior  performance  by  promoting  and 
advancing  individuals  to  higher  positions  of  authority. 
As  such,  promotion  rate,  despite  its  problems,  has  a 
strength  of  recognition  well  beyond  that  of  technical 
performance.*

8.  This  study  uses  graphical  methods  for  depiction  of  many 
of  the  methods  of  analysis. 


*Study number four from Defense Manpower Study uses both
independent variable.


III.  OVERVIEW OF THE DATA


A.  INTRODUCTION 

A  critical  aspect  of  this  thesis  was  the  selection  and 
screening  of  data.  Two  general  guidelines  were  applied  in 
creating  the  data  set.  First,  the  data  set  had  to 
demonstrate  a  level  of  homogeneity  in  that  the  NCO's 
considered  would  all  have  served  under  similar  enlistment  and 
advancement  policies.  Secondly,  the  selection  of  individual 
records  needed  to  be  random  and  without  unintentional  bias  to 
meet  the  requirements  for  a  representative  sample  set. 
Section  III  C.  describes  in  detail  the  measures  taken  to 
insure  that  the  above  two  attributes  were  established  in  the 
study  data  set. 

Recoding of data values into numerical equivalents was
required for several personnel record fields. As an example,
the level of Military Schooling, which is the NCO's
in-service schooling level, was recorded as mixed alpha-numeric
characters. Transformation involved rank ordering the
available levels of schooling in ascending hierarchical order
and substituting a numeric value for the alpha-numeric value.
Chapter IV discusses in detail the background of each
variable. Finally, as a check on the effects of manipulating
and restricting the sample data set, Section III D. provides
a comparison of statistics for the entire U.S. Army NCO
database versus the sample data set used in this thesis.
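The rank-order recoding described above can be sketched as follows. This is a minimal illustration only: the schooling codes shown are hypothetical placeholders, not the actual Army codes, and the function name is an assumption made for the example.

```python
def recode_ordinal(values, ordered_levels):
    """Replace each alpha-numeric code with its rank in ordered_levels.

    ordered_levels lists the codes from lowest to highest schooling level,
    so the lowest level maps to 0, the next to 1, and so on.
    """
    rank = {code: i for i, code in enumerate(ordered_levels)}
    return [rank[v] for v in values]


# Hypothetical schooling codes in ascending order (illustrative only).
levels = ["N0", "P1", "B2", "A3"]
records = ["B2", "N0", "A3", "P1", "B2"]

numeric = recode_ordinal(records, levels)
print(numeric)
```

The resulting values preserve order but not distance between levels, which is why such a variable is treated as ordinal rather than ratio scale.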


B.  DESCRIPTION  OF  THE  VARIABLES 

The data variables used in this study fall into three
categories:  control  variables,  intelligence  variables,  and 
promotion  variables.  The  first  two  categories,  control  and 
intelligence,  were  used  as  explanatory  variables,  while  the 
promotion  variables  were  used  as  the  dependent  variables.  A 
brief  description  of  each  variable  is  tabulated  in  Table  I. 


TABLE I
Summary of Variables in Sample

Variable  Category   Meaning                          Value     Scale

Dependent

PRATE     Promotion  Raw Promotion Rate:              .041-.21  Ratio
                     number of promotions
                     per month to most
                     recent promotion
RATE      Promotion  Promotion rate difference        2.2-9.4   Ratio
                     from average for that
                     paygrade (normalized)
PRA       Promotion  Promotion rate difference        3.4-8.0   Ratio
                     from average for that
                     paygrade and CMF
                     (normalized)

Explanatory

SEX       Control    Male/Female                      0/1       Nominal
CMF       Control    Career Management Field          11-99     Nominal
RACETH    Control    Race/Ethnic group                1-5       Nominal
PAYGD     Control    Paygrade                         5-7       Ordinal
GTSCR     Intell     General Intelligence Score       0-160     Ordinal
AFQTP     Intell     Armed Forces Qualification       1-100     Ordinal
                     Test Score Percentile
OAFQTP    Intell     Same as AFQTP, referenced        1-100     Ordinal
                     on 1980 population
EIMCAT    Intell     Mental Category; based           1-8       Ordinal
                     on OAFQTP
HIYRED    Intell     Highest Year of Education        1-12      Ordinal
                     upon entry into Army
EDLVL     Intell     Present Education Level          1-12      Ordinal
NCOE      Intell     Military Education Level         0-13      Ordinal
                     Attained
PQSCR     Intell     Army Proficiency Test            0-100     Ratio


A  more  detailed  description  of  each  of  the  study 
variables  will  be  given  in  the  first  part  of  Chapter  IV, 
Successive  Analysis. 

C.  PREPARATION  OF  THE  DATA 

Preparation  of  the  data  began  with  acquiring  fifty 
thousand  records  from  the  U.S.  Army  Military  Personnel  Center 
in  Alexandria,  Virginia.  Initial  restrictions  on  the  data 
were  established  to  allow  inclusion  of  only  NCO's  with  a  date 
of  entry  after  January  1,  1976.  Further,  NCO's  selected  had 
to  be  members  of  the  Regular  Army,  and  not  Reserve  or 
National  Guard  forces.  These  restrictions  provided  for 
observation  of  only  those  NCO's  who  were  recruited  a 
reasonable  time  period  following  the  ending  of  the  Viet  Nam 
War,  and  following  the  establishment  of  the  All-Volunteer 
Force.  Restricting  the  NCO's  to  Regular  Army  soldiers 
focused  the  study  on  the  standing  forces  alone,  and  avoided 
confounding  as  a  result  of  different  promotion  and  accession 
policies  in  the  Reserve  and  Guard  Forces. 

The records requested were randomly drawn by taking every
fifth individual from an estimated population of 250,000
meeting the above restrictions. The fifty thousand MILPERCEN
records were then matched and merged with a similar personnel
database from the Defense Management Data Center (DMDC),
Monterey, California. The DMDC database holds additional
information, including: the ability to distinguish high
school equivalent certificate holders from actual graduates,
the highest year of education of the soldier at time of
enlistment, and AFQTP and EIMCAT scores renormed for a 1980
population.

After the merging, data records which had missing values
in any of the critical variable fields were dropped. There
were approximately ten thousand records missing critical
data. Following initial analysis of promotion rates, two
additional restrictions were applied against the remaining
records.
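The draw, merge, and screening steps described above can be sketched in a few lines. This is an illustration under stated assumptions, not the procedure actually run at MILPERCEN or DMDC: the field names (`id`, `paygd`, `hiyred`, `oafqtp`) are hypothetical, and dropping records with no DMDC match is an assumption the text does not state explicitly.

```python
def every_kth(records, k=5):
    """Systematic draw: keep the k-th, 2k-th, 3k-th, ... record."""
    return records[k - 1::k]


def merge_and_screen(milpercen, dmdc_by_id, required):
    """Merge on a shared identifier and drop incomplete records."""
    merged = []
    for rec in milpercen:
        extra = dmdc_by_id.get(rec["id"])
        if extra is None:
            continue  # assumption: unmatched records are not carried forward
        row = {**rec, **extra}
        # Keep the record only if every critical field has a value.
        if all(row.get(field) is not None for field in required):
            merged.append(row)
    return merged


# Tiny illustrative population of 20 records (hypothetical field names).
population = [{"id": i, "paygd": 5} for i in range(1, 21)]
sample = every_kth(population, 5)

dmdc = {
    5: {"hiyred": 12, "oafqtp": 41},
    10: {"hiyred": None, "oafqtp": 60},  # missing critical value
    15: {"hiyred": 11, "oafqtp": 25},
}
clean = merge_and_screen(sample, dmdc, ["paygd", "hiyred", "oafqtp"])
print([r["id"] for r in clean])
```

Taking every fifth record from 250,000 yields the fifty thousand records cited in the text; the sketch applies the same one-in-five rule to a toy population.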

First, a grouping of several hundred promotion rates
showed that individuals had been promoted to the rank of E-5
at rates which were as high as one promotion per month.
Cross referencing of service numbers identified this
subgroup as NCO's who had served in Reserve or Guard units and
who, for a variety of reasons, had been called for active
duty. As such, they were allowed by regulation to carry with
them an accelerated promotion to their former rank.
Subsequently, a serial number match and elimination was done
for all NCO's with recent listing as Reserve or Guard status.

A second source of unusual promotion rates at the E-5
level became apparent in some of the more technically
oriented career management fields, the medical field in
particular. Research into Army special recruitment policy
indicated that during the early 1980's special provisions
were made to allow persons with background ability in certain
technical fields to enter the Army and be promoted to NCO
status within six months, or in certain cases to receive NCO
status immediately following basic training.1 To correct for
these anomalies, all promotion rates which fell outside the
maximum time periods considering application of both waivers
were discarded.

D.  COMPARISON  TO  TOTAL  ARMY  STATISTICS 

In  this  section,  selected  attributes  of  the  sample  data 
set  and  the  complete  U.S.  Army  database  are  briefly  compared, 
with  the  intent  of  checking  the  representativeness  of  the 
sample  set. 

Population attributes such as distribution of sex, Career
Management Fields, and paygrade were obtained from the
complete U.S. Army database records consisting of over
250,000 NCO's.

As described in Section III C., the sample data set of
50,000 selected records had been filtered to contain only
personnel who entered the Army after 1976. Screening of
those 50,000 records for completeness of data and uniformity
of promotion policy reduced the number in the sample set to
approximately 38,000. It was prudent, then, to check the
final sample set to see if it retained its representative
character as a random sample. It should be noted, however,
that this comparison will not occur for all study variables.
Reasons for this include non-availability of records from the
MILPERCEN database, and cases where the statistic was
produced through computation by the author, promotion rates
being the principal example.

1  MSG Knopp, NCOIC Defense Management Data Center, West.
El Estero Drive, Monterey CA 93946.

1.  Comparison of Army versus Sample Summary Statistics

Formal  hypothesis  testing  for  means  or  distributions 
with  ANOVA  was  unavailable  due  to  computational  and  software 
restrictions.  However,  since  the  intent  of  this  section  was 
simply  to  identify  any  population  shifts,  and  the  magnitude 
of  those  shifts,  observation  of  summary  statistics  is  assumed 
to  be  sufficient.  Specifically,  the  means  and  the  standard 
deviations  of  four  variables  were  obtained  from  both  the 
entire  NCO  population  data  set  and  the  thesis  sample  data 
set.  The  percent  difference  between  the  variable  means  was 
computed  and  expressed  relative  to  the  thesis  sample  data.  A 
table  of  comparative  statistics  and  the  percent  difference  is 
shown  in  Table  II. 


TABLE II
Total Army vs Sample Summary Statistics

            Total Army (250,000)   Sample (37,854)
Variable    Mean      Std Dev      Mean      Std Dev     Percent Difference

AFQTP       48.3      25.2         53.4      20.9        Sample 10% >
SEX         1.09      .283         1.12      .328        Sample 2.7% >
RACETH      1.63      .991         1.65      .942        Sample 1.2% >
PAYGD       5.75      .597         5.27      .464        Sample 5.2% <
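One reading of "percent difference expressed relative to the thesis sample data" is the difference in means divided by the sample mean. The sketch below applies that reading to three rows of Table II; the formula is an assumption, since the text does not state it explicitly. The PAYGD row is left out because its printed means do not reproduce the printed 5.2 percent under this reading.

```python
def pct_diff_rel_sample(army_mean, sample_mean):
    """Percent by which the sample mean differs from the Army mean,
    expressed relative to the sample mean (assumed formula)."""
    return 100.0 * (sample_mean - army_mean) / sample_mean


# (Total Army mean, Sample mean) as printed in Table II.
means = {
    "AFQTP": (48.3, 53.4),
    "SEX": (1.09, 1.12),
    "RACETH": (1.63, 1.65),
}

diffs = {name: round(pct_diff_rel_sample(a, s), 1)
         for name, (a, s) in means.items()}
for name, d in diffs.items():
    print(f"{name}: sample {d}% higher")
```

Under this reading the SEX and RACETH rows come out at 2.7 and 1.2 percent, and AFQTP at a little under 10 percent, in line with the rounded table entries.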

The three variables AFQTP, SEX, and PAYGD have
noticeable changes between the Sample and the Total Army,
while the RACETH variable doesn't appear to have been
affected much by sampling. A closer look at the discrete
distributions, and an overall conclusion about differences in
the two data sets follows.

2.  Discrete Distributions

Figures  3.1  and  3.2  illustrate  differences  in  the 
discrete  distributions  for  paygrade  and  race  respectively. 
Both  plots  are  Clustered  Bar  Charts,  and  the  percentage  of 
each  level  of  the  discrete  variable  for  both  the  Total  Army 
and  the  Sample  were  plotted  next  to  each  other. 


Figure 3.1  Army vs Sample Paygrade Percentages

Figure 3.2  Army vs Sample Race Percentages

Observation of the tabular data and bar charts shows
that there are some differences between the two populations.
Specifically, the sample contains more lower ranking
personnel, slightly more women, and significantly higher
AFQTP related scores. The racial make-up of the sample
appears to be similar.


The restriction of random sampling to only those persons
who entered the Army after 1976 may
explain these differences. First, the lower average paygrade
is a direct result of promotion policy, in which it is
impossible to achieve a rank above E-7 in less than ten
years. Hence, the sample population should demonstrate a
lower average paygrade. Secondly, the slight increase in the
proportion of women might be explained by a general opening
up of the services to women in the late seventies and early
eighties. Thirdly, the higher AFQTP is a direct result of
policy restrictions begun in Fiscal Year 1981, and formalized
by the 1984 Defense Authorization Act. This placed quality
constraints on AFQT Category and high school diploma status.
[Ref. 10:sec 1-0, p.1] Whether these restrictions, or the
general improvement of social acceptance of the military
services, resulted in this AFQT improvement is a question
which would require significant study in itself.

In short then, the sample is different in several ways
from the total NCO population. It should be noted, however,
that these results are intentional. The shifts caused by
restricting the sample to after 1976 are felt to be less
dangerous to the study than the alternative of including
soldiers who were accessed during the draft and the era of
Viet Nam War policies. Finally, it is only a matter of time,
unless significant changes in accession and promotion policy
occur, before the character demonstrated by the sample data
set will constitute the norm for all NCOs. Thus, it is
concluded that the study sample is satisfactory.


IV.  SUCCESSIVE  DATA  ANALYSIS 


A.  INTRODUCTION 

In this chapter the results of a systematic method for
data analysis will be reported. This method of analysis
followed a format which is described by Chambers in Graphical
Methods for Data Analysis. [Ref. 12] This procedure develops
an understanding of the data, beginning with simple
univariate descriptive procedures, then progressing through
several increases in dimensionality of variables, and finally
into the more complex inferential procedures of model
building and multivariate regression. An abbreviated outline
of this procedure is shown below.

1.  Analysis  of  single  variables. 

2.  Comparison  of  variable  distributions. 

3.  Analysis  of  paired  variables. 

4.  Multivariate graphical analysis.

5.  Linear  Models  including: 

a.  Simple  Regression 

b.  Multivariate  Models 

In addition to these steps, this procedure will be
supplemented with several non-graphical measures, such as
ANOVA, ANCOVA, and several tabular nonparametric methods. It
should be noted that this analysis reports only those
procedures which are considered an essential step in
investigation, or whose results provided an observation of
merit. Many available procedures have not been used in this
chapter as a consequence of the data failing to meet
distributional assumptions, and for other reasons which would
make such analysis inappropriate. During the development of
this chapter, the results of each level of analysis will
specify why the next set of analysis procedures was pursued.
Alternatively, if a popular class of procedures is
disregarded, the logic for disregarding is explained.

The  objective  of  detailing  this  procedure  is  to  present  a 
thorough  depiction  of  the  nature  of  the  variables,  and  to 
explain  the  development  of  resulting  inferences  and  models. 


B.  UNIVARIATE  ANALYSIS. 

1.  Dependent Variables
a.  PRATE 

(1)  General.  The variable PRATE represents the
raw promotion rate of a particular individual. Numerically,
it is the total of promotions per month up to the most recent
promotion.

(2)  Value.  The variable PRATE was computed
using data obtained from the DMDC database. The time to most
recent promotion in months was found by subtracting the basic
pay entry date from the date of latest award of rank. This
number then became the denominator of a ratio having the
individual's rank, or equivalently, the total number of
promotions the individual has received, as the numerator:


\[
\text{PRATE} = \frac{\text{Individual's Latest Rank}}{(\text{Award Date of Latest Rank}) - (\text{Date of Entry in Army})}
\]


Ranks  were  numerically  represented  with  a  score  of  5  for 
an  E-5  Sergeant,  and  with  6  and  7  for  values  of  the  next  two 
ranks.  The  resulting  units  of  measurement  for  the  PRATE 
variable  were:  units  of  promotion  per  month  of  service. 
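The PRATE ratio defined above can be computed as in the following sketch. Counting the elapsed time in whole calendar months is an assumption made for the example; the text states only that the difference is expressed in months. The function names and the example dates are illustrative.

```python
from datetime import date


def months_between(start, end):
    """Whole calendar months from start to end (assumed convention)."""
    return (end.year - start.year) * 12 + (end.month - start.month)


def prate(rank, entry_date, award_date):
    """Raw promotion rate: rank (5, 6 or 7) per month of service
    up to the date of the latest award of rank."""
    months = months_between(entry_date, award_date)
    if months <= 0:
        raise ValueError("award date must follow entry date")
    return rank / months


# Illustrative E-5 promoted 49 months after entry.
example = prate(5, date(1980, 1, 1), date(1984, 2, 1))
print(round(example, 5))
```

An E-5 promoted at 49 months has a raw rate of 5/49, which is the median value of 0.10204 reported for the sample.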

(3)  Attributes  of  the  Variable.  The  variable 
PRATE  qualifies  as  a  continuous  variable  with  a  ratio  scale. 
The  continuous  nature  of  the  variable  relies  on  the  fact  that 
the  number  of  months  service  combined  with  three  rank 
structures  yields  sufficient  combinations  of  values,  actually 
190  in  all,  to  use  as  measures. 

There are some inherent problems with the raw PRATE
score, since promotion policies are in effect which set
minimum time thresholds for promotion. Thus, the promotion
rate of an individual who is presently an E-5 will not be
comparable to the promotion rate of an E-7 whose three
promotions have been affected by the minimum time policy.
Generally, the minimum time in service between promotions
grows as rank increases, and more senior soldiers will
normally have lower raw promotion rates.

A  second  source  of  bias  is  potentially  found  in  the 
Career  Management  Field  (CMF)  of  the  soldier.  Army  promotion 
policy  is  based  on  a  system  of  minimum  performance  points  to 
be  attained  within  a  CMF  in  order  to  be  considered  for 
promotion.  Generally,  the  more  technical  fields  will  have 
higher  promotion  point  thresholds  than  non-technical  fields. 

The distribution of the variable PRATE and its summary
statistics are shown in Figure 4.1. The shape of the
histogram is positively skewed, demonstrating a steep
ascending slope in the first partitions, then a generally
flat shape until just past the median value. After the
median value, a gradual downward sloping tail occurs. A
rough interpretation of this shape is that there appear to
be a few individuals who are promoted at very fast rates,
followed by a block of average promotion rates, then a
diminishing tail of individual promotion rates which fall to
the right of the seventy-fifth percentile.


HISTOGRAM TABLE
X               : PRATE
SELECTION       : ALL
X LABEL         : PRATE
NO. OF ELEMENTS : 37854
X MEAN          : 0.10940
STD. DEVIATION  : 0.036322
SKEWNESS        : 0.59367
KURTOSIS        : 2.5854
5-PERCENTILE    : 0.061225
25-PERCENTILE   : 0.08
MEDIAN          : 0.10204
75-PERCENTILE   : 0.13514
95-PERCENTILE   : 0.17857
X MIN.          : 0.041667
X MAX.          : 0.20833

Figure 4.1

Distribution transformation of this variable was not
attempted, primarily because its usefulness in testing or
modelling is limited by the problems associated with the bias.

b.  RATE

(1)  General.  The variable RATE is a re-expression
of the variable PRATE. The bias due to individual rank is
removed by normalizing each individual score relative to
his or her paygrade.

(2)  Values.  To compute the variable RATE, the
average PRATE value for each paygrade was calculated, as well
as the standard deviation for that paygrade. Individual
scores were then normalized by the transformation:

\[
\text{RATE}_i = \frac{\text{PRATE}_i - \text{AVERAGE for that Rank}}{\text{STANDARD DEVIATION for that Rank}}
\]

(3)  Attributes of the Variable.  The variable
RATE is also a continuous ratio scale variable, as it is a
transformation of PRATE.

The  removal  of  influence  due  to  rank  was  confirmed  by 
computing  the  correlation  coefficient  between  the  variables 
RATE  and  PAYGD.  As  seen  in  Table  X,  a  value  of  near  zero 
resulted  where  the  previous  correlation  coefficient  for  PRATE 
and  PAYGD  had  been  -.495.  Thus,  the  transformation  to  RATE 
from  PRATE  results  in  a  variable  independent  of  PAYGD. 
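The within-paygrade normalization and the correlation check described above can be sketched as follows. The PRATE values are made-up illustrative numbers, not thesis data, and the use of the population standard deviation is an assumption, since the text does not say which form was used.

```python
from statistics import mean, pstdev


def normalize_by_group(values, groups):
    """Standardize each value against the mean and standard deviation
    of its own group (here, paygrade)."""
    by_group = {}
    for v, g in zip(values, groups):
        by_group.setdefault(g, []).append(v)
    stats = {g: (mean(vs), pstdev(vs)) for g, vs in by_group.items()}
    return [(v - stats[g][0]) / stats[g][1] for v, g in zip(values, groups)]


def pearson(x, y):
    """Pearson product-moment correlation coefficient."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Illustrative data only: higher paygrades show lower raw rates.
prate = [0.125, 0.104, 0.139, 0.083, 0.075, 0.091, 0.058, 0.066]
paygd = [5, 5, 5, 6, 6, 6, 7, 7]

rate = normalize_by_group(prate, paygd)
r_before = pearson(prate, paygd)
r_after = pearson(rate, paygd)
print(round(r_before, 3), round(r_after, 3))
```

Because every paygrade group has mean zero after the transformation, the covariance between the normalized score and paygrade vanishes, which is the effect the correlation check in the text confirms.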

The distribution shape of the RATE histogram, shown in
Figure 4.2, appears slightly non-normal, but a check of the
summary statistics for quantiles shows that they correspond
closely to the standard normal quantiles. Thus, the
assumption of normality for procedures using this variable is
still reasonable, based on observation of the distribution
shape and the close agreement of quantile values.

Figure 4.2 presents a histogram and summary statistics for
the RATE variable.
HISTOGRAM TABLE
X               : RATE
SELECTION       : ALL
X LABEL         : RATE
NO. OF ELEMENTS : 37854
X MEAN          : -1.565E-6
STD. DEVIATION  : 0.99997
SKEWNESS        : 0.21408
KURTOSIS        : 2.3767
5-PERCENTILE    : -1.5476
25-PERCENTILE   : -0.77578
MEDIAN          : -0.03757
75-PERCENTILE   : 0.70754
95-PERCENTILE   : 1.6234
X MIN.          : -2.2681
X MAX.          : 3.6685

Figure 4.2


c.  PRA

(1)  General.  The variable PRA is another
recomputation of the raw promotion rate. PRA controls for
the career management field as well as paygrade. It is a set
of normalized promotion scores, which are independent of
PAYGD and CMF. The independence of PRA from these variables
was confirmed by checking correlation coefficients. Both
variables CMF and PAYGD had near zero values of correlation
with PRA.

(2)  Values.  Computing the variable PRA was done
in the same manner as in RATE; however, a mean and standard
deviation for each CMF and PAYGD combination was computed and
used in the normalization equation.

(3)  Attributes.  PRA is a continuous variable
with a ratio scale. The distribution of PRA appears normal,
with the quantile values very close to the standard normal.
A comparison of percentile values for PRA versus the standard
normal is shown in Table III.
HISTOGRAM TABLE
X               : PRA
SELECTION       : ALL
X LABEL         : PRA
NO. OF ELEMENTS : 37854
X MEAN          : 7.41E-9
STD. DEVIATION  : 0.99881
SKEWNESS        : 0.21406
KURTOSIS        : 2.6652
5-PERCENTILE    : -1.5518
25-PERCENTILE   : -0.75252
MEDIAN          : -0.04146
75-PERCENTILE   : 0.69604
95-PERCENTILE   : 1.7086
X MIN.          : -3.4988
X MAX.          : 4.5374

Figure 4.3

A comparison of percentiles for the PRA distribution
versus the standard normal distribution is shown in Table III.
Specifically, the PRA percentile values are listed with the
corresponding standard normal percentile values for the same
data point. For example, -1.5510 is the PRA five percentile,
while a -1.5510 indexed in a standard normal table results in
a six percent value.


TABLE III.  Comparison of PRA vs Standard Normal Percentiles

PRA      Standard Normal

 5%       6%
25%      22.6%
50%      48.4%
75%      75.7%
95%      96.3%

Normality  for  this  variable  will  be  assumed  based  on 
general  distribution  shape  and  the  close  correspondence  of 
the  data  percentiles  to  the  standard  normal  percentiles. 
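The table lookup behind this comparison can be reproduced with the standard normal cumulative distribution function. The sketch below is an illustration using the PRA percentile values printed in Figure 4.3; the function name is an assumption, and small differences from Table III may arise from rounding or transcription in the printed figures.

```python
import math


def std_normal_cdf(z):
    """Cumulative probability of the standard normal distribution at z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# PRA sample percentiles as printed in Figure 4.3.
pra_percentiles = {5: -1.5518, 25: -0.75252, 50: -0.04146,
                   75: 0.69604, 95: 1.7086}

normal_pct = {p: round(100.0 * std_normal_cdf(z), 1)
              for p, z in pra_percentiles.items()}
for p, pct in normal_pct.items():
    print(f"PRA {p}th percentile -> standard normal {pct}%")
```

If PRA were exactly standard normal, each sample percentile would map back to its own percentage; the closeness of the two columns is the basis for assuming normality.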

2.  Control Variables

d.  SEX 

The variable SEX is discrete and nominal. Males
are represented by a numerical value of one, and females are
represented with a two. In the study sample, 12.29 percent
of the sample were female, and 87.71 percent were male.

e.  CMF 

Career  Management  Field  (CMF)  is  a  discrete 
variable  with  nominal  scale.  Thirty  three  CMF's  are 


combat branch, such as Infantry or Armor. Center CMF values
are  indicative  of  combat  support  branches,  such  as  Signal  and 
Chemical.  Upper  CMF  values  are  from  the  combat  service 
support  branches,  such  as  Medical  and  Language  Specialist. 

Figure  4.4,  the  CMF  histogram,  does  reflect  the 
distribution  of  the  three  general  groupings  of  CMF  densities: 
combat,  combat  support,  and  combat  service  support.  The 
combat  and  combat  support  values  have  roughly  equivalent 
representation,  while  the  upper  numbered  service  support 
CMF's  are  about  two  thirds  the  size  of  the  other  groups. 
Figure 4.4

f.  RACETH 

The race-ethnic variable is a discrete, nominal
variable. The values represented and their percentages are
shown in Table IV.

TABLE IV  Sample Race Percentages

Value   Race                              Percent   Cumulative Percent

1       White                             52.43      52.43
2       Black                             38.59      91.02
3       Hispanic                           5.58      96.6
4       American Indian/Alaskan Native      .26      96.86
5       Asian/Pacific Islander             1.15      98.01
6       Other/Unknown                      1.99     100.00

g.  PAYGD 

Paygrade is a discrete, ordinal variable. The
selection of NCO rank from personnel enlisting after 1976
resulted in representation by paygrades E-5 through E-7 only.
The distribution of PAYGD is shown in Table V.


TABLE V  Sample Paygrade Percentages

Value   Rank                  Percent   Cumulative Percent

5       Sgt E-5               73.29      73.29
6       Staff Sergeant E-6    25.89      99.19
7       SFC E-7                0.81     100.00

The  0.81  percent  for  E-7  results  in  only  307  SFC's  in  the 
sample.  Despite  the  preponderance  of  representation  by  the 
other  ranks,  a  sample  size  of  307  for  the  E-7  rank  still 
allows  for  adequate  representation  of  that  subcategory. 


3.  Intelligence Variables

h.  GTSCR


The General Intelligence Test Score (GTSCR) of
the individual is a continuous variable with at least an
ordinal scale. The range of values runs from 50 through 160.
The lower value of 50 represents the corresponding minimum
score of ASVAB modules that would allow for enlistment in the
Army. The histogram of the GTSCR variable, shown in Figure
4.5, is approximately normal. Checking the quantiles shows a
larger density in the distribution to the left of the mean,
with slightly lower values for quantiles right of the mean.

GTSCR HISTOGRAM AND STATISTICS

HISTOGRAM TABLE
X               : GTSCR
SELECTION       : ALL
X LABEL         : GTSCR
NO. OF ELEMENTS : 37854
X MEAN          : 108.23
STD. DEVIATION  : 14.275
SKEWNESS        : 0.129
KURTOSIS        : 3.3632
5-PERCENTILE    : 84
25-PERCENTILE   : 99
MEDIAN          : 109
75-PERCENTILE   : 117
95-PERCENTILE   : 130
X MIN.          : 54
X MAX.          : 156

Figure 4.5
i.  AFQTP

The Armed Forces Qualification Test Percentile is
a continuous variable with ordinal scale. Its value
represents the relative standing of an individual's test
score referenced against a 1944 population. This means that
an individual's raw AFQT score is compared against a standard
table of values that was developed in 1944. This table of
values from 1944 was designed to represent the distribution
of raw AFQT test scores for the entire 1944 American youth
population. Hence, a resulting individual AFQT score is
simply the corresponding percentile of the individual raw
AFQT score relative to the entire 1944 population AFQT test
distribution.

The histogram and summary statistics for AFQTP are shown
in Figure 4.6. The density of AFQTP is partially symmetric
about the mean. The fifth percentile is at a
value of 21, demonstrating the restriction applied to CAT V
and VI personnel since 1980. Use of the AFQT score for this
study is primarily for comparative reasons. AFQT cannot be
used in any developed model since scoring against the 1944
reference population has ceased. As will be seen in
subsequent chapters, AFQT was discarded anyway when OAFQT
proved to be a better explanatory variable.
HISTOGRAM TABLE
X               : AFQTP
SELECTION       : ALL
X LABEL         : AFQTP
NO. OF ELEMENTS : 37854
X MEAN          : 53.419
STD. DEVIATION  : 20.965
SKEWNESS        : 0.29913
KURTOSIS        : 2.2128
5-PERCENTILE    : 21
25-PERCENTILE   : 37
MEDIAN          : 50
75-PERCENTILE   : 68
95-PERCENTILE   : 91
X MIN.          : 10
X MAX.          : 99

Figure 4.6

j.  OAFQTP

The OAFQTP variable is a continuous variable with
ordinal scale. It is fundamentally the same as the AFQTP
variable, excepting the reference for measurement, which is a
1980 population. The distribution for OAFQTP is considerably
more dense in the lower values than AFQTP. Explanation of
this shift can be seen by reviewing the transformation tables
in Appendix A for converting 1944-based scores to 1980
scores. The transformations for values below 80 reduce a
1944-based score in almost every case. The
amount of reduction varies, but it can be as much as four
points. Only when the scores go above 85 are there any
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HISTOGRAM  TABLE 
X  : OAFQTP 

SELECTION  : ALL 

X  LABEL  :OAFQT 

NO.  OF  ELEMENTS  : 37854 

X  MEAN  : 45. 31 9 

STD.  DEVIATION  : 24. 779 

SKEWNESS  : 0.531 39 

KURTOSIS  : 2. 1725 

5-FERCENTILE  :14 

25-PERCENTILE  : 25 

MEDIAN  : 41 

75-PERCENTILE  :64 

95-PERCENTILE  :92 

X  MIN.  :  1 

X  MAX.  :99 


Figure  4.7 

k.  EIMCAT

EIMCAT is the mental category of an individual
based on the 1980 reference population AFQT test score.
EIMCAT is a discrete and ordinal scale variable. The
assignment of categories is a Department of Defense standard,
and is a common reference for all services. The breakdown of
values is as follows:


TABLE VI  Sample Mental Category Percentages

                                          Cumulative
Value  Category    AFQT     Percent       Percent

  1    Cat V       01-09      .33           .33
  2    Cat IV C    10-15     6.736         7.067
  3    Cat IV B    16-20     9.788        16.854
  4    Cat IV A    21-30    19.187        36.041
  5    Cat III B   31-49    26.116        62.157
  6    Cat III A   50-64    13.053        75.21
  7    Cat II      65-92    19.99         95.2
  8    Cat I       93-99     4.8         100.000

A histogram of the EIMCAT values follows in Figure 4.8.


SAMPLE  EIMCAT  DISTRIBUTION 


Figure  4.8 

The above table and figure show more clearly that the
EIMCAT categories are not evenly spaced across the scale of
OAFQT scores. For example, the center EIMCAT, value five,
spans almost twenty points, while EIMCAT eight contains only
the upper seven point scores. EIMCAT does, however, make
available an established, discrete scale measurement
representing intelligence test scores for use in appropriate
statistical procedures.
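The uneven category widths can be made concrete with a short lookup. This is only an illustrative sketch: the band boundaries are copied from Table VI, while the function name `eimcat` and the list names are hypothetical and not part of the thesis data set.

```python
from bisect import bisect_right

# Lower bound of each AFQT percentile band, taken from Table VI.
LOWER_BOUNDS = [1, 10, 16, 21, 31, 50, 65, 93]
LABELS = ["Cat V", "Cat IV C", "Cat IV B", "Cat IV A",
          "Cat III B", "Cat III A", "Cat II", "Cat I"]

def eimcat(score):
    """Map a 1980-referenced AFQT percentile (1-99) to (EIMCAT value, label)."""
    if not 1 <= score <= 99:
        raise ValueError("AFQT percentile must be between 1 and 99")
    value = bisect_right(LOWER_BOUNDS, score)  # 1 through 8
    return value, LABELS[value - 1]

# Number of percentile points covered by each category.
widths = [upper - lower
          for lower, upper in zip(LOWER_BOUNDS, LOWER_BOUNDS[1:] + [100])]

print(eimcat(45))   # falls in the 31-49 band
print(widths)
```

The `widths` list shows directly that category five covers 19 points while category eight covers only 7, which is the unevenness noted in the text.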


l.  HIYRED

HIYRED is the highest year of education held by
the individual upon entry into the Army. It is a discrete
and ordinal scale variable. The values and distribution
percentages are shown in Table VII.

TABLE VII  Sample Highest Year of Education

                                                   Cumulative
Value  Category                         Percent    Percent

  1    1-7 Years                         0.018       0.018
  2    8 Years                           0.153       0.172
  3    1 Year High School                1.397       1.569
  4    2 Years High School               4.7         6.269
  5    3-4 Years HS (no diploma)         6.935      13.203
  5.5  High School GED                   4.813      18.017
  6    High School Diploma              71.274      89.29
  7    1 Year College                    3.305      92.595
  8    2 Years College                   3.453      96.048
  9    3-4 Years College (no degree)     1.337      97.385
 10    College Graduate                  2.560      99.945
 11    Masters or Equivalent             0.05       99.995
 12    Doctorate or Equivalent           0.005     100.000

m.  EDLVL 


EDLVL  is  the  present  level  of  education  for  the 
individual.  These  scores  are  related  to  HIYRED,  in  that  any 
education  taken  by  the  individual  subsequent  to  enlistment  is 
recorded  in  this  variable.  A  GED  equivalency  is  included  as 
a  value  of  six  for  high  school  completion. 


TABLE VIII  Sample Education Level Percentages

                                                   Cumulative
Value  Category                         Percent    Percent

  1    1-7 Years                         0.042       0.042
  2    8 Years                           0.011       0.053
  3    1 Year High School                0.198       0.251
  4    2 Years High School               0.793       1.043
  5    3-4 Years HS (no diploma)         1.503       2.547
  6    High School Diploma              80.443      82.99
  7    1 Year College                    6.089      89.079
  8    2 Years College                   5.828      94.907
  9    3-4 Years College (no degree)     2.037      96.944
 10    College Graduate                  2.948      99.829
 11    Masters or Equivalent             0.1        99.992
 12    Doctors or Equivalent             0.008     100.000



Figure 4.9 and the percentages in Table VIII show an upward
shift of education level after enlistment. This shift is
made possible, and is encouraged, by official continuing
education and high school completion programs.


HIYRED  AND  EDLVL  PERCENTAGES 


Figure  4.9 

n.  NCOE 

The Noncommissioned Officer Education variable,
NCOE, is a discrete and ordinal scale variable. It reports
the level of military schooling accomplished by the
individual. Military schooling categories are generally
organized in three ascending levels: primary, basic and
advanced. At the two lower levels, primary and basic, there
are separate courses for combat and non-combat CMF's. In
some cases, there has been an award of an On-The-Job Training
qualification. The OJT award is used to give credit to an
NCO who can achieve technical competence in advance of being
eligible for promotion to the next higher paygrade.

As  previously  mentioned,  attendance  at  military  schools 
is  sometimes  associated  with  an  individual  being  previously 
identified  as  a  superior  performer.  This  is  true  mostly  in 
the  advanced  level  schools  where  selection  for  attendance  is 
through  Department  of  the  Army  Selection  Boards.  At  the 
primary  level,  local  commanders  have  authority  to  establish 
selection  procedures  and  often  will  make  primary  school 
attendance a locally mandatory requirement for junior NCOs.
Table  IX  and  Figure  4.10  demonstrate  the  categories  and 
distribution  of  NCOE. 


TABLE IX  Sample NCOE Percentages

                                                    Cumulative
Value  Category                            Percent  Percent

  0    Nonparticipant                       21.19    21.19
  1    Primary NCO Course (CBT CMF)          4.46    25.65
  2    Primary Leadership Graduate          39.36    65.25
  3    On-The-Job Credit for E-5 skills      5.38    70.63
  4    Primary Technical Course Graduate     2.82    73.45
  5    On-The-Job Credit for E-6 skills      0.0     73.45
  6    Basic Technical Course Graduate       5.11    78.56
  7    Basic NCO Course (CBT CMF)           15.99    94.55
  8    On-The-Job Credit for E-7 skills       .01    94.56
  9    Advanced NCO Course Selectee          2.28    96.84
 10    Advanced NCO Course Graduate          3.06    99.89
 11    Advanced NCO nongraduate, OJT          .01    99.9
 12    On-The-Job Credit for E-8 skills       .06   100.00

SAMPLE NCOE SCHOOLING PERCENTAGES
(axis label: NCOE EDUCATION)

Figure 4.10


o.  PQSCR

PQSCR is a report of the Primary Military
Occupation Skill Qualification Test (SQT) score of the
individual. It is a continuous and ratio-valued variable.
The SQT is a service related test which is used to determine
the technical competence of a soldier. SQT score has been
used by promotion boards as a qualitative measure for
promotion. The numerical value represents the percent of
correct answers on a written and hands-on evaluation.
Separate SQT tests are written for each CMF, although the
structure of the tests is similar.

The distribution of PQSCR, shown in Figure 4.11, is more
dense in the upper values, with an abnormally long left tail
extending to a lower bound of 21. An explanation for the
shape of the PQSCR distribution is an involved topic, and has
itself been the subject of study. A general observation is
that PQSCR has previously been used in a manner where
individual soldier scores were often aggregated as a means of
comparison of the parent unit of the soldiers. [Ref. 11:p. 4]
Thus, significant unit and individual training emphasis has
been focused on SQT testing in previous years, and pressure
to perform well was influenced by the parent organizations.
As a result, a negatively skewed distribution (skewness of
-0.70832), rather than a normal distribution, is
understandable.


PQSCR HISTOGRAM AND STATISTICS

HISTOGRAM TABLE

X                : PQSCR
SELECTION        : ALL
X LABEL          : PQSCR
NO. OF ELEMENTS  : 37854
X MEAN           : 78.384
STD. DEVIATION   : 11.609
SKEWNESS         : -0.70832
KURTOSIS         : 3.5739
5-PERCENTILE     : 57
25-PERCENTILE    : 71
MEDIAN           : 80
75-PERCENTILE    : 87
95-PERCENTILE    : 95
X MIN.           : 21
X MAX.           : 100

Figure 4.11

3.  Summary

The fifteen variables used in this study demonstrate
a wide variety of characteristics. All of the dependent
variable choices were continuous with two, RATE and PRA,
showing only slight departures from normality. The other
continuous variables did not have identifiable distributions,
and could not be transformed to normality using power or log
transformations. Nor is it entirely clear that one would need
to use a transformed variable in subsequent analysis.

The independent variables comprise a mixture of
continuous and discrete values, with both ordinal and ratio
scales. Within the independent variables there are two
principal sets of related variables. The intelligence test
scores, AFQTP, OAFQTP, EIMCAT, and to a lesser extent GTSCR,
are all derived from the ASVAB. These variables differ from
one another in varying degrees, and are either a
re-expression, transformation, or a similarly derived set of
scores.

The two academic performance measures, EDLVL and HIYRED,
are related, in that EDLVL is simply HIYRED plus any
additional schooling since entry into the Army.

Despite  the  similarities  within  these  two  sets  of 
variables,  it  is  felt  that  sufficient  differences  in 
informational  value  are  present  in  each  expression.  Further, 
since  the  variables  used  are  all  standard  data  collection 
items  for  the  DMDC  database,  each  variable  expression  will  be 
studied.  The  relative  merit  of  any  single  or  combined 
variable  from  this  study  may  be  useful  to  managers  seeking 
appropriate  data  sources  for  other  studies. 


An important result of the analysis of these study
variables is the observation that many of the necessary
assumptions for standard parametric hypothesis testing,
Analysis of Variance (ANOVA), and possibly regression will
not be met. These include assumptions about the form of the
distribution as well as the scale of the variable. In this
study, analysis will initially seek to use standard
parametric methods. However, if results of the analysis are
sensitive to distributional or scale assumptions, those
assumptions will be checked. If examination of assumption
requirements fails, or if there is a nonparametric test of
similar efficiency, nonparametric tests will be conducted as
a replacement or as a confirmatory procedure.

C.  BIVARIATE  ANALYSIS 

This section will concentrate on identifying
relationships between pairs of variables, and on identifying
shifts in distribution as a function of the effects, or
categorical, variables. Three methods of analysis will be
used in this section. The first method is analysis of
association using a matrix of Pearson product-moment
correlations. This will provide initial information as to
the strength of association between any two variables, and
the direction of that relationship, being either positively
or negatively correlated. The second method will be analysis
of scatterplots of pairs of variables, using the techniques
of LOWESS and Jittering to better view any trends in the
variables. This method will give initial information on what
type of fitted line, and hence what mathematical
relationship, exists between independent and dependent
variables. Of significant interest will be whether the
relationship is fundamentally linear, or whether it is
possibly polynomial or curvilinear. The third and final
method used will be analysis of three-dimensional empirical
distribution plots. This will demonstrate some shifts in
distribution within several of the effects variables.

1.  Correlation Matrix

As  earlier  mentioned,  the  purpose  of  reviewing  the 
Pearson  product-moment  correlation  matrix  is  to  identify 
pairs  of  variables  which  have  a  strong  association.  The 
range  of  the  correlation  coefficient,  rho,  is  from  -1  to  +1, 
and  a  value  of  zero  indicates  that  the  variables  have  no 
linear  association  with  each  other.  A  value  of  +1  indicates 
an  exact  direct  linear  relationship,  while  a  -1  indicates  an 
exact  inverse  linear  relationship.  This  measurement  of 
association  is  not  completely  indicative  of  dependency,  and 
is  only  a  preliminary  tool  to  identify  candidate  variables 
for  testing  and  subsequent  inferential  statistics. 
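As a small illustration of the range and sign conventions just described, the Pearson coefficient can be computed directly from its definition. This is a generic sketch on made-up numbers; the function name `pearson` is hypothetical and the data are not from the thesis sample.

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
direct = pearson(x, [2 * v + 1 for v in x])     # exact direct linear relationship
inverse = pearson(x, [10 - 3 * v for v in x])   # exact inverse linear relationship
print(direct, inverse)
```

The two cases reproduce the limiting values of +1 and -1; real variable pairs, such as those in Table X, fall between these extremes.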

Remembering  the  central  question  of  this  thesis,  the  most 
important  pairs  of  variables  will  then  be  any  of  the 
intelligence  and  academic  scores  paired  with  the  promotion 
rate  variables.  Of  almost  equal  interest  will  be  any 
interval  scale  effects  variables  demonstrating  a  strong 


[Ref. 13:pp. 251-253] The Spearman method is a
distribution-free method providing correlations based on the
ranks of the variables. The last column on the second part
of Table X lists the correlations computed using the
Spearman method. Comparison of Spearman versus Pearson
values showed that there was an acceptable correspondence
between the two methods, and Pearson values are used
exclusively to simplify analysis.
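The rank-based idea behind the Spearman method can be sketched as follows: replace each observation by its rank, then compute the ordinary Pearson coefficient on the ranks. This is a minimal sketch assuming average ranks for ties; the helper names are hypothetical and the data are made up.

```python
import math

def average_ranks(values):
    """Rank values 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson product-moment correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman(x, y):
    """Spearman rank correlation: Pearson applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]          # monotone but not linear
rho_pearson = pearson(x, y)
rho_spearman = spearman(x, y)
print(rho_pearson, rho_spearman)
```

Because the ranks ignore the spacing between values, the Spearman value is exactly 1 for any strictly increasing relationship, while the Pearson value falls below 1 when the relationship is not a straight line.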

Even with application of both the Spearman and Pearson
methods there remained several pairs of variables which did
not meet the assumed distributional characteristics for
correct interpretation of the rho value. These variables are
the discrete, nominal variables SEX, RACETH, and possibly
CMF. Their results are included in Table X, but any
interpretation of the rho value would be ineffective. The
most important rho values in Table X are located under the

TABLE X

Pearson Correlation Coefficients

          PRATE    RATE     PRA   GTSCR   AFQTP  OAFQTP  EIMCAT   PQSCR

PRATE     1.000    .822    .790    .035    .100    .177    .174    .039
RATE       .822   1.000    .951    .118    .155    .209    .200    .101
PRA        .790    .951   1.000    .107    .133    .177    .170    .094
GTSCR      .035    .118    .107   1.000    .741    .734    .689    .274
AFQTP      .100    .155    .133    .741   1.000    .937    .903    .308
OAFQTP     .177    .209    .177    .734    .937   1.000    .955    .315
EIMCAT     .174    .200    .170    .689    .903    .955   1.000    .305
HIYRED     .156    .168    .177    .210    .215    .245    .209    .066
EDLVL      .085    .139    .162    .266    .257    .266    .241    .100
NCOE      -.200    .047    .006    .039   -.009   -.060   -.062    .093
SEX        .013   -.019    .036    .055    .159    .050    .062   -.013
CMF        .074   -.143    .000    .113    .106    .074    .067   -.042
RACETH    -.064   -.084   -.057   -.242   -.305   -.325   -.314   -.128
PAYGD     -.495    .000    .000    .143    .087    .031    .023    .097
PQSCR      .039    .101    .094    .274    .398    .315    .305   1.000

PEARSON COEFFICIENTS CONTINUED

                                                                 SPEARMAN
          PAYGD  HIYRED   EDLVL    NCOE     SEX     CMF  RACETH     PRATE

PRATE     -.495    .157    .085   -.200    .013   -.075   -.064     1.000
RATE       .000    .168    .139    .047   -.018   -.142   -.084      .808
PRA        .000    .178    .162    .005    .036    .000   -.056      .777
GTSCR      .143    .210    .265    .039    .054    .113   -.242      .020
AFQTP      .087    .215    .258   -.009    .159    .107   -.306      .075
OAFQTP     .031    .245    .266   -.060    .049    .074   -.325      .165
EIMCAT     .023    .209    .242   -.062    .063    .068   -.313      .158
HIYRED     .001   1.000    .708   -.063    .131    .146    .024      .147
EDLVL      .098    .708   1.000    .004    .114    .177    .039      .038
NCOE       .433   -.063    .004   1.000   -.081   -.184    .015     -.208
SEX        .057    .131    .114   -.081   1.000    .258    .042      .020
CMF        .053    .146    .177   -.184    .258   1.000    .025     -.069
RACETH    -.016    .024    .039    .015    .042    .025   1.000     -.092
PAYGD     1.000    .000    .098    .432   -.056   -.054   -.016     -.535
PQSCR      .097    .066    .100    .093   -.013   -.042   -.128

The most significant observations from the tables are

The  most  significant  observations  from  the  tables  are 
summarized  as  follows: 

For  the  variable  RATE  there  is  zero  correlation  with  the 
PAYGD  variable.  Thus,  the  transformation  of  PRATE  to  RATE 
did  remove  the  influence  of  paygrade  on  promotion  rate. 
Similarly, for the variable PRA, both PAYGD and CMF have zero
correlation.

As  expected,  the  three  promotion  rate  variables  are  all 
highly  correlated  in  a  positive  direction. 

With  two  exceptions,  the  correlation  values  for  the 
effects  and  independent  variables  have  similar  magnitudes  and 
signs  across  all  three  expressions  of  promotion  rate.  The 
first  exception  is  the  NCOE  variable.  Under  PRATE  it  is 
negatively  correlated  with  a  value  of  0.2,  and  positively 
correlated  with  lower  values  for  RATE  and  PRA.  This  result 
makes  sense  when  one  considers  that  NCOE  is  highly  correlated 
with  PAYGD,  (0.7.65).  Specifically,  raw  promotion  rates  are 
lower  for  higher  grade  NCO's  due  to  time  in  service  and  time 
in  grade  requirements,  (-.495).  Hence,  NCOE,  which  is  highly 
correlated  with  PAYGD,  will  also  reflect  that  inverse 
relationship.  When  the  influence  of  paygrade  is  eliminated, 
as  it  is  in  RATE  and  PRA,  this  negative  correlation  is 
incidentally  removed. 

The second exception is for the variable SEX, which is
positively signed for PRATE and PRA, but negatively signed
for RATE. The magnitudes of all three values are close to
zero.

RATE will be presented in the analysis of empirical
distributions and coded scatterplots.

Groups of closely related variables have generally the
same correlation across the three promotion variables.
Specifically, AFQTP, OAFQTP, EIMCAT, and to a lesser extent,
GTSCR, all demonstrate a strong positive correlation against
each other, and show the same trend when compared against the
promotion rate variables. The academic variables HIYRED and
EDLVL demonstrate similar characteristics; however, EDLVL is
weaker than HIYRED with respect to the promotion rate
variables.

Considering  RATE  and  PRA  as  the  better  promotion 
variables  to  model  with,  and  allowing  for  only  one  variable 
from  each  of  the  related  groups,  the  six  most  significant 
correlated  variables  were  selected.  These  variables,  listed 
in descending absolute value of rho, are shown in Table XI.


TABLE XI  Most Significant Correlated Variables
Considering both RATE and PRA

Variable    Rho Value

HIYRED      approx  0.17
OAFQTP      approx  0.14
GTSCR       approx  0.10
PQSCR       approx  0.09
RACETH      approx -0.06
NCOE        approx  0.006

These variables, paired either with RATE or PRA, were used
as the starting basis for multivariate regression analysis.
The effects variable SEX was included for subcategory
analysis in an effort to detect any influence it might have
on the primary relationships.

2.  Paired Scatter Plots and Simple Regression

Plots of paired independent and dependent variables
were implemented to accomplish two purposes. The first
purpose was to visually search for any dominant plotting
patterns. Since the rho values found in the previous section
are designed to detect only linearity, it is quite possible
that nonlinear relationships could exist between the
explanatory and dependent variables. For example, if the X-Y
relationship was strictly Y = X^2, with X values symmetric
about zero, a computed rho value should be zero. Thus, if one
relied only on correlation coefficients to detect
relationships, one would be misled into thinking that no
relationship existed between the two variables. Simply
plotting X-Y scatterplots of the explanatory variables with
the promotion variables did not require specification of the
response of the dependent variable. Visual observation could
then be relied upon to detect dominant patterns of any form.
These scatterplots used two special procedures, LOWESS and
Jittering, which will be described in analysis of Figures
4.12 and 4.13.
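The Y = X^2 example can be checked numerically. The sketch below uses made-up X values; it assumes X is symmetric about zero for the zero-correlation case, and shows that the same quadratic relationship produces a large rho once X is restricted to one side of zero. The function name `pearson` is hypothetical.

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# X symmetric about zero: the exact quadratic relationship is invisible to rho.
x_sym = [-3, -2, -1, 0, 1, 2, 3]
rho_sym = pearson(x_sym, [v * v for v in x_sym])

# X restricted to non-negative values: the same relationship now looks
# strongly (but not perfectly) linear.
x_pos = [0, 1, 2, 3, 4, 5, 6]
rho_pos = pearson(x_pos, [v * v for v in x_pos])

print(rho_sym, rho_pos)
```

This is why a scatterplot remains a useful check even when the correlation coefficient is near zero.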

Secondly, simple least squares regression was performed
for all variables which had been previously found to be
significantly correlated. The simple least squares
regression procedure yielded a value called the Coefficient
of Determination, or R2 (R-square). R2 is mathematically
related to rho, and in the one variable case, the square
of rho is equal to R2. Thus, R2 can also be used to
qualitatively interpret the strength of linearity for a
simple linear model. The advantage of producing R2 values
was that R2 directly represents the proportion of variance
accounted for by the assumption of a linear model. The
results for each of the regressions and an explanation of R2
will be discussed in analysis of Table XII.
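The identity between R2 and the square of rho in the one-variable case can be verified with a short calculation. This is a sketch on made-up data, with hypothetical function names; R2 is computed here as one minus the ratio of residual to total sum of squares.

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def simple_least_squares(x, y):
    """Fit y = intercept + slope * x; return (intercept, slope, R2)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return intercept, slope, 1 - ss_res / ss_tot

x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
intercept, slope, r2 = simple_least_squares(x, y)
rho = pearson(x, y)
print(intercept, slope, r2, rho ** 2)
```

For these numbers rho is 0.8 and R2 is 0.64, so 64 percent of the variance in y is accounted for by the fitted line.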

a.  Paired Scatterplots

Since interpretation of the correlation
coefficients assumes linearity, visual analysis of pairwise
scatterplots was used to search for observable patterns,
linear or otherwise. This visual approach did not require
interpretation of single derived parameters to identify any
patterns.

In producing the scatterplots the LOWESS procedure was
used. LOWESS, which stands for Locally Weighted Regression
Scatter Plot Smoothing [Ref. 12:pp. 94-95], is a
nonparametric smoothing procedure which is designed to
estimate functional relationships between Y and X. In
particular, no linear or quadratic relationship is assumed.
For scatterplots of discrete variables against the
continuous promotion rate variables, the discrete variables
were Jittered to overcome repeated plotting of points.
Jittering involves generating small random increments, which
are then added to the X values. As a result, when the X-Y
plot is performed fewer X values are repeatedly plotted in
the same location, and a better visual interpretation can be
made of the quantity of X values at a discrete level.
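Both procedures can be sketched in a few lines. The code below is a simplified illustration, not the routine used for the thesis plots: the jitter width, the smoothing fraction `frac`, the tricube weight function and the omission of the robustness iterations found in full LOWESS implementations are all assumptions made for brevity.

```python
import random

def jitter(values, width=0.2, seed=0):
    """Add a small uniform random increment to each value."""
    rng = random.Random(seed)
    return [v + rng.uniform(-width, width) for v in values]

def lowess(x, y, frac=0.5):
    """Locally weighted linear fit evaluated at each x (single pass)."""
    n = len(x)
    k = max(2, int(round(frac * n)))          # points in each neighborhood
    fitted = []
    for x0 in x:
        dists = sorted(abs(xi - x0) for xi in x)
        h = dists[k - 1] or 1e-12             # local bandwidth
        # Tricube weights: near points count most, points beyond h count zero.
        w = [(1 - min(abs(xi - x0) / h, 1.0) ** 3) ** 3 for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted

# Discrete levels (hypothetical education codes) spread out for plotting.
levels = [6] * 5 + [7] * 5
jittered = jitter(levels)

# On exactly linear data the smoother reproduces the line.
xs = list(range(10))
ys = [2 * v + 1 for v in xs]
smooth = lowess(xs, ys)
print(jittered, smooth)
```

Plotting the jittered values against the dependent variable, and overlaying the smoothed values, gives the kind of display described for Figures 4.12 and 4.13.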

The  overall  results  of  the  LOWESS  plots  showed  that  the 
predominant  pattern  was  indeed  linear.  Further,  the  linear 
pattern  was  demonstrated  most  clearly  between  pairs  of  highly 
correlated  variables.  Figures  4.12  and  4.13  demonstrate  that 
linearity  and  the  LOWESS  and  Jittering  techniques 
respectively.  As  a  result,  linear  modelling  techniques  were 
considered  to  be  the  best  choice  for  subsequent  analysis. 


LOWESS PLOT OF PRA VS OAFQTP
(N=200S)

Figure 4.12


LOWESS SCATTERPLOT OF HIYRED VS PRA
(horizontal axis: JITTERED HIYRED)

Figure 4.13


b.  Simple Regression

For pairs of significantly correlated variables,
simple least squares regression using PRA as the dependent
variable was accomplished. The simple least squares
regression for pairs yields quantitative results in terms of
slope values, intercept values, tests of the slope and
intercept values, and the R2 value.

The R2 value represents what proportion of total variance
was explained by the simple linear model. As such, its
values range from zero to one. An R2 value of zero would
indicate that a linear model does not account for any
variance of the dependent values. Correspondingly, a value
of zero would be the estimate of the slope of the line. The
significance of R2, like rho, is related to sample size. To
determine the significance of an R2 value, the results of the
T test for the slope of the model are checked. If the T
statistic is large and the probability of a greater T value
small, a null hypothesis of a slope of zero is strongly
rejected. Thus, we can be confident of the linearity of the
model and the derived slope estimate. Sample size is
considered in this test because the T statistic is computed
as a function of sample size. Thus, even with a small R2
value, if the T test for the slope were significant, the R2
value would necessarily be held as significant. The only
qualification for a low R2 value would be that there was
considerable 'noise' or unaccounted variance in the response
of the dependent variable. A summary of results appears in
Table XII.
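The dependence of the slope test on sample size can be illustrated with the standard identity for simple regression, in which the T statistic for the slope equals rho times the square root of (n - 2), divided by the square root of (1 - rho squared). The sketch below applies it to a correlation of 0.177 (the OAFQTP-PRA value in Table X) at the full sample size of 37854 and at a hypothetical sample of 30; whether the thesis regressions used the full sample is an assumption here, and the function name is hypothetical.

```python
import math

def slope_t_statistic(r, n):
    """T statistic for a zero-slope null in simple regression, from rho and n."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

r = 0.177                                   # R2 is only about 0.031
t_large = slope_t_statistic(r, 37854)       # full-sample size from the histograms
t_small = slope_t_statistic(r, 30)          # hypothetical small sample
print(round(r * r, 4), t_large, t_small)
```

With the same small R2, the T statistic is roughly 35 for the large sample but below 1 for the small one, which is why a low R2 can still correspond to a strongly significant slope.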


TABLE XII  Simple Least Squares Summary Data
using PRA as Dependent Variable

Variable  Intercept  Std Err  Slope  Std Err  R2

GTSCR     -0.056     (0.0061

AFQTP

AFQTP having measurable R2 values and positive slopes.

As  expected,  the  results  of  the  simple  regression 
analysis  coincide  with  observations  taken  from  the 
correlation  table. 

When  considered  one  at  a  time,  there  appear  to  be  only  a 
handful  of  variables  demonstrating  a  reportable  relationship 
with  the  promotion  variables.  The  low  R2  value  for  each 
regression  indicates  either  a  large  proportion  of  pure  error, 
or  significant  unexplained  variance  due  to  other  explanatory 
variables  not  being  included. 

3.  3D Empirical Density Plots

Three dimensional empirical density plots were used
to visually check for distribution changes in the continuous
variables within the subcategories of SEX, PAYGD and RACETH.
Two such plots will be discussed because they depict visually
data characteristics identified in earlier tabular results.
These characteristics were: the application of AFQT
restrictions by congressional mandate in 1980, and the
differences in OAFQT scores across racial groups.

The AFQT restriction is depicted in Figure 4.14, where
empirical densities for OAFQT are plotted for each paygrade.
Observing the three densities shows that only the E-7
paygrade distribution contains scores less than twenty. This
makes sense, considering that all the E-7 enlistments were
prior to 1980. Another interesting observation from this
plot is that high OAFQT scores become more dominant as
paygrade increases. This is most apparent in comparing the
E-7 density to either the E-5 or E-6. This shift in density
of OAFQT across the three paygrades suggests that attrition
tends to manifest itself in the lower AFQT categories, but
that a low AFQT score is, in itself, not prohibitive in
achieving senior enlisted rank.

The second 3-D empirical density plot, Figure 4.15, shows
the differences in renormed AFQT scores across racial
subcategories. A large discrepancy between the distribution
for whites and the distributions for blacks or Hispanics is
easily seen, although Indians have a similar AFQT
distribution to that of whites. This observation coincides
with the occurrence of different promotion rates between
different racial categories as well. However, to make
inferences about promotion policy among races would require
further research. As pointed out by Daula, [Ref. 11:pp. 7-10]
the attrition pattern among different racial groups shifts
the averages for both promotion rate and AFQT among the races
over time. Since the purpose of this thesis is one of
prediction, it is more important to identify the effect and
account for it in the model. An explanation as to the cause
of this phenomenon does not appear to be easily obtained from
the thesis data.

What is important about this plot is that it visually
demonstrates the correlation between RACETH and OAFQT. If
OAFQT is a significant determiner of promotion rate, then
RACETH will be an important covariate.


demonstrate a significant data characteristic. that

This corresponds to the zero value for the PRA-SEX
correlation coefficient also found in Table X.

E.  LINEAR  MODELS 

1.  Analysis of Variance

One-Way ANOVA was used in this thesis as an
intermediate step in defining a final inference model.
ANOVA's usefulness has been as an investigative tool to
detect differences in means among classes of explanatory
variables. For example, using PRA as the dependent variable
and EIMCAT as the independent variable, One-Way ANOVA will
compare and test the equality of the average PRA score across
the eight levels of EIMCAT, i.e., mental categories one
through eight. In the testing, the null hypothesis is that
all eight mental category PRA means are equal, while the
alternate hypothesis is that they are not. The
pair of categories, small discrepancies between all eight
categories,  or  any  combination  of  difference  conditions. 
Thus,  ANOVA  has  limited  value  in  discerning  the  location  and 
magnitude  of  the  differences  between  category  means,  but  it 
does  identify  if  differences  exist  and  how  strong  those 
differences  are. 
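The F statistic and R2 produced by a One-Way ANOVA can be sketched from the between-level and within-level sums of squares. The code below is a generic illustration on made-up groups, not a reproduction of the thesis computations; the function name is hypothetical, and R2 is taken here as the between-level share of the total sum of squares.

```python
def one_way_anova(groups):
    """Return (F, R2) for a list of groups of observations."""
    all_values = [v for g in groups for v in g]
    n, k = len(all_values), len(groups)
    grand_mean = sum(all_values) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of level means about the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of observations about their own level mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    f_stat = (ss_between / (k - 1)) / (ss_within / (n - k))
    r2 = ss_between / (ss_between + ss_within)
    return f_stat, r2

# Three hypothetical category levels with three observations each.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, r2 = one_way_anova(groups)
print(f_stat, r2)
```

A large F indicates that at least one level mean differs from the others, but, as noted above, it does not say which levels differ or by how much.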

Table XIII tabulates a twelve by three matrix of results
for separate One-Way ANOVA's. The rows are the twelve
explanatory variables and the columns are the three promotion
variables. Using all three promotion measures as the
dependent variable allowed for a check of ANOVA values and
trends across those measures.

In addition to the results of the F test, a value of R2
is reported. This R2 value is different from that reported
in the simple linear regression model. This is because the
ANOVA procedure considers the independent variable as a set
of levels, rather than a single continuous variable. With
One-Way ANOVA, all variables had some level of R2 reported.
Further, because of the increased informational value of
variable categories, and hence, more degrees of freedom for
computation, the values of R2 increased above the simple
regression reported values.

resources used could handle all the integer values for the
score ranges of AFQTP and the other continuous variables, it
was possible to gain insight into the existence of
differences between individual score cells.

Additionally, nonparametric procedures were used to
evaluate the relationships. [Ref. 13:pp. 250-255] The
nonparametric ANOVAs utilized the ranks of the variables and
also yielded the F statistic for testing the hypothesis of
equal level means. Having agreement between the parametric
and nonparametric values removed the need to pursue
confirmation of assumptions for parametric ANOVA. It
also allowed analysis of results to focus on the resultant
values of F and R2 tabulated in Table XIII.
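One common way to obtain an ANOVA on ranks, consistent with the description above, is to replace every score by its rank in the pooled sample and then compute the usual F statistic on those ranks. The sketch below assumes that rank-transform approach; the helper names and the sample data are hypothetical, and the procedure actually used in the study may differ in detail.

```python
from statistics import mean

def average_ranks(values):
    """Ranks 1..n, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # mean 1-based position
        i = j + 1
    return ranks

def f_statistic(groups):
    """One-Way ANOVA F statistic for a list of groups."""
    scores = [x for g in groups for x in g]
    grand = mean(scores)
    ss_b = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_w = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_b / (len(groups) - 1)) / (ss_w / (len(scores) - len(groups)))

def rank_anova_f(groups):
    """F statistic computed on pooled ranks instead of raw scores."""
    ranks = average_ranks([x for g in groups for x in g])
    ranked, start = [], 0
    for g in groups:
        ranked.append(ranks[start:start + len(g)])
        start += len(g)
    return f_statistic(ranked)

# Hypothetical scores; the last group contains one extreme value.
groups = [[0.1, 0.4, 0.2], [0.5, 0.9, 0.7], [1.5, 1.1, 30.0]]
f_raw = f_statistic(groups)
f_rank = rank_anova_f(groups)
print(f_raw, f_rank)
```

Because ranks are insensitive to the size of an extreme score, the rank-based F does not depend on the normality assumption in the way the parametric F does, which is why agreement between the two is reassuring.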


TABLE XIII  One-Way ANOVA Summary

                   PRATE                RATE                 PRA
Variable       F        R2          F        R2          F        R2

SEX (1)        5.9      .00016      13.3     .00351      48.4     .00128
CMF (2)        35.      .02788      93.3     .07415      0.0      .00000
RACETH         90.      .01177      165.0    .02133      80.0     .01049
PAYGD (3)      6292.    .24953      0.0      .00000      0.0      .00000
GTSCR          18.      .04250      13.4     .03184      10.9     .02636
AFQTP          32.      .07046      20.6     .04623      17.3     .03908
OAFQTP         36.      .08441      25.3     .06101      19.      .04657
EIMCAT         37.      .01076      71.5     .02035      96.9     .02739
HIYRED         96.      .02950      106.0    .03272      117.     .03590
EDLVL          37.      .01076      71.5     .02035      96.9     .02739
NCOE           156.     .05097      76.4     .02499      46.8     .01583
PQSCR          1.9      .00375      6.6      .01341      5.8      .01181

(1) The Pr>F (level of rejection of the null hypothesis of no
    difference in means) was .0145 for PRATE, .0003 for RATE and
    .0001 for PRA.
(2) The Pr>F for PRA is 1.0.
(3) The Pr>F for RATE is 1.0, and for PRA is 1.0.
Values of Pr>F for the remainder of the table were .0001.


Review of Table XIII demonstrates some anticipated
results, which are summarized in the following paragraphs.

Since  the  variables  PAYGD  and  CMF  were  controlled  for  in 
the  derivation  of  PRA,  there  is  correspondingly  no 
relationship  between  those  variables  and  the  PRA  promotion 
variable.  Likewise,  the  variable  PAYGD  was  controlled  for  in 
the  derivation  of  RATE,  and  there  was  no  linear  relationship 
demonstrated for that pair. The zero values for the F
statistic and R2 for those variable combinations document
this fact.

Using RATE or PRA as the dependent variable, and allowing
for only one, most significant variable to be selected from
each of the intelligence and academic groups, results in the
same set of explanatory variables as were found in
correlation analysis. These variables were: HIYRED, OAFQTP,
GTSCR, PQSCR, RACETH, NCOE, and SEX. The most significant
variables were the ones which had the larger F statistic and
R2 value. This set is not ordered, however, since there are
differences in order between the PRA and RATE models.

Another interesting development from ANOVA results when
the explanatory variable mean and variance for each level are
plotted against the promotion variable. This is not a standard
analytical plot, but it does provide some visual information
on the size, direction, and dispersion about the center line
of an independent discrete variable. This plot is most
similar to a strip box plot for continuous variables.
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An  example  plot  where  each  individual's  PRA  score  was 
plotted against the sum of his EIMCAT and HIYRED score is
shown in Figure 4.17. In Figure 4.17 the two center lines
plotted represent the sum of scores for EIMCAT and HIYRED
separated between the GED qualified personnel and High School
Diploma qualified personnel. The outside two lines trace the
upper and lower bounds one standard deviation from the
computed means.

X-Y PLOT OF MEANS AND VARIANCES


Figure  4.17 

By  plotting  a  separate  line  for  each  high  school  diploma 
category  it  can  be  seen  that  while  both  groups  have  a  similar 
increase  in  promotion  rate,  as  the  combined  level  of  EIMCAT 
and  HIYRED  increased,  the  GED  qualified  personnel  were 
consistently  a  fixed  level  lower  than  a  fully  qualified  high 
school  graduate.  Thus,  the  additional  merit  of  an  actual 
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high  school  diploma  did  manifest  itself  in  promotion  rate. 

A  final  look  at  ANOVA  involves  specifying  a  model  using 
the  set  of  the  seven  most  significant  independent  variables, 
and  then  checking  for  interactions  among  them.  Table  XIV 
gives  the  results  of  the  Seven-Way  ANOVA  using  this  model: 

RATE  =  7  Main  Effects  +  Two  Way  Interactions 

Table  XIV  depicts  the  seven  most  significant  variables 
individually  in  the  Main  Effects  rows,  and  the  interaction 
terms  in  the  Interactions  rows. 

The  advantage  of  this  Seven-Way  ANOVA  is  that  inclusion 
of  all  of  the  explanatory  variables  simultaneously  allows  for 
comparison  of  the  significance  of  each  of  the  explanatory 
variables  relative  to  the  others.  Additionally,  specifying 
combinations  of  two-way  interactions  checks  to  see  if  any  two 
of  the  explanatory  variables  are  significantly  related  to  one 
another.  An  example  of  an  interaction  would  be  a  SEX  and  CMF 
term.  As  has  been  previously  shown,  female  personnel  tend  to 
be associated with higher CMF values. If the ANOVA model for
promotion included a term which was the product of the two
values, SEX*CMF, then the two attributes would be jointly
considered in the ANOVA model. If the interaction term was
found to be significant, then the two individual variable
entries for CMF and SEX would be removed and only the
interaction term retained.

An additional consideration in the Seven-Way ANOVA was
that the model was unbalanced. Unbalanced means that there
were some combinations of the factor levels which did not
have any entries in the ANOVA cells. An example of this can
be seen in the SEX*OAFQT term. Specifically, there are only
76 degrees of freedom for the interaction term, while the
individual degrees of freedom for SEX and OAFQT are 1 and 79
respectively. Thus, the SEX*OAFQT term had three
combinations without entries. As a result, the F statistic
computed will be only approximate. Since the purpose of this
step in analysis was exploratory, the F statistic estimates
were considered adequate.
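The count of empty cells follows from simple arithmetic: with every cell filled, a two-way interaction has degrees of freedom equal to the product of the two main-effect degrees of freedom. The short sketch below reproduces that check; the function name is an illustrative assumption, while the numbers are those quoted in the text.

```python
def full_interaction_df(df_a, df_b):
    """Interaction degrees of freedom when every cell has entries."""
    return df_a * df_b

df_sex, df_oafqt = 1, 79   # main-effect degrees of freedom
df_reported = 76           # degrees of freedom reported for SEX*OAFQT
df_full = full_interaction_df(df_sex, df_oafqt)
empty_cells = df_full - df_reported
print(df_full, empty_cells)
```

The same comparison can be applied to any other interaction row of Table XIV to see how many factor-level combinations had no entries.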

Table XIV presents the results of a Seven-Way ANOVA using
RATE as the dependent variable. Similar results were
obtained using PRA as the dependent variable.


TABLE XIV  7-Way Analysis of Variance with Interaction

DEPENDENT VARIABLE: RATE

SOURCE             DF       SSQ         MEAN SQUARE   F VALUE   PR > F   R2

MODEL              14966    18869.39    1.260818      1.52      0.0001   0.491
ERROR              22887    18981.65    0.829364
CORRECTED TOTAL    37853    37851.04

ROOT MSE           0.91069421

SOURCE             DF       ANOVA SS    F VALUE   PR > F

Main Effects
RACETH             5        807.35      194.69    0.0001
SEX                1        13.28       16.02     0.0001
OAFQT              79       1670.54     25.50     0.0001
HIYRED             12       1238.25     124.42    0.0001
GTSCR              93       1205.22     15.63     0.0001
NCOE               13       945.89      87.73     0.0001
PQSCR              78       507.52      7.85      0.0001

Interactions
RACETH*SEX         5        0.00        0.00      1.0000
SEX*OAFQT          76       440.59      6.99      0.0001
SEX*HIYRED         9        65.03       8.85      0.0001
SEX*GTSCR          72       72.80       1.22      0.0999
SEX*NCOE           11       57.76       6.33      0.0001
SEX*PQSCR          70       53.06       0.91      0.6795
RACETH*OAFQT       335      0.00        0.00      1.0000
RACETH*HIYRED      46       107.84      2.83      0.0001
RACETH*GTSCR       326      0.00        0.00      1.0000
RACETH*NCOE        46       8.41        0.22      1.0000
RACETH*PQSCR       288      104.24      0.44      1.0000
OAFQT*HIYRED       593      112.62      0.23      1.0000
OAFQT*GTSCR        2864     2418.55     1.02      0.2570
OAFQT*NCOE         614      954.24      1.87      0.0001
OAFQT*PQSCR        3631     3182.33     1.06      0.0137
HIYRED*GTSCR       564      130.88      0.28      1.0000
HIYRED*NCOE        88       276.98      3.80      0.0001
HIYRED*PQSCR       518      484.13      1.13      0.0251
GTSCR*NCOE         604      718.86      1.44      0.0001
GTSCR*PQSCR        3383     2997.93     1.07      0.0051
NCOE*PQSCR         542      504.44      1.12      0.0268

Three important observations can be obtained from Table XIV.
The first observation is that there are few significant
interaction terms. Only those terms marked with an asterisk
demonstrated statistical significance at the level .0001.
Of these, only three had F values greater than 3.8. These
interaction terms were OAFQTP, HIYRED, and NCOE, all
interacting with SEX. The presence of these interactions in
the Seven-Way ANOVA model was previously [...] in the
correlation matrix, Table X, where SEX was positively
correlated with HIYRED and OAFQTP ([...], respectively),
and negatively correlated with NCOE [...].
The implication of having significant interactions is
that they would need to be included in any predictive model.
Thus, identification of interactions using ANOVA was
critical.

Secondly, all the main effects variables continue to be
significant, even when used simultaneously by the model.

Lastly, selecting the single most significant explanatory
variable from the academic and education groups yields the
same unordered best set as did the One-Way ANOVA: OAFQTP,
HIYRED, GTSCR, NCOE, RACETH, and SEX.

In summary, the fundamental result of ANOVA was
confirmation that there are differences in the level means of
promotion scores due to several independent explanatory
variables, and an agreement as to which were the best
explanatory variables when considered separately or
simultaneously.

Also, plotting the means and variances of the sum of
EIMCAT and HIYRED versus PRA demonstrated that there was a
[...]

[...] was only [...] the existence of significant
differences among the levels of the independent variables.
Beyond acknowledging that there are some independent
variables available to explain promotion rates, Seven-Way
ANOVA did not provide any numerical measure of the [...] of
the contribution of a given independent variable to the
model. [Ref. 14:p. 101] In addition, in analysis of the
continuous variables, the nature of the variable was changed
to represent a discrete valued variable.

Incorporating continuous variables into ANOVA was
achieved through the intermediate method of ANCOVA. ANCOVA
[...] metric continuous variables as well as nonmetric
qualitative values. The result of ANCOVA was an improved
multivariate model with the inclusion of continuous variables
in their proper form. ANCOVA provided estimates of the
linear coefficients for the continuous variables, and
reported on the proportion of variance accounted for by each
categorical variable as well. These results provided the
basis for further removal of variables or interactions from
the set previously identified. [Ref. 15:pp. 343-349]

The  model  considered  was  based  on  the  results  of  the 
previous  chapters  and  consisted  of  the  following  form: 

Promotion = f(OAFQTP, PQSCR, GTSCR, HIYRED, NCOE, RACETH, SEX
plus interaction terms SEX*HIYRED, SEX*NCOE, SEX*OAFQTP)

The variables OAFQT, PQSCR, and GTSCR are metric and
continuous, HIYRED and NCOE are discrete and metric, and
RACETH and SEX are discrete and nonmetric.

A representation of the model using notation consisted of
the following form:

\[ Y_i = B_0 + B_1 X_1 + B_2 X_2 + B_3 X_3 + D_1 + D_2 + \dots + D_4 + I_1 + \dots + I_3 \]

In the above notation, $Y_i$ is the promotion variable PRA,
$B_0$ is the linear intercept, and $B_1$ through $B_3$ are
coefficients for the continuous variables OAFQT, GTSCR and
PQSCR. The coefficients $B_1$ through $B_3$ are assumed to be the
same for all levels of the other variables. $D_1$ through $D_4$
represent the discrete variables RACETH, SEX, HIYRED, and
NCOE. $I_1$ through $I_3$ are the interaction terms OAFQT*SEX,
HIYRED*SEX, and NCOE*SEX.

This  model  is  also  unbalanced  and  the  F  statistics  are 
estimates.  The  results  of  the  ANCOVA  using  this  model  are 
shown  in  Table  XV. 
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TABLE  XV  ANCOVA  with  Interactions 


DEPENDENT VARIABLE: PRA

SOURCE         DF       SSQ         MEAN SQUARE   F VALUE   PR > F   R2

MODEL          55       2423.68     44.07         47.13     0.0001   0.0642
ERROR          37798    35339.29    0.934
CORR TOTAL     37853    37762.98

ROOT MSE       0.966

SOURCE         DF    TYPE III SS     F VALUE   PR > F

Main Effects
OAFQT          1     12.89440024     13.79     0.0002
RACETH         5     152.10095609    32.54     0.0001
SEX            1     5.31950192      5.69      0.0171
HIYRED         12    517.91751116    46.16     0.0001
GTSCR          1     3.65772995      3.91      0.0479
NCOE           13    132.83314221    10.93     0.0001
PQSCR          1     80.15632971     85.73     0.0001

Interactions
OAFQT*SEX      1     4.03387863      4.31      0.0378
SEX*HIYRED     9     10.16825209     1.21      0.2844
SEX*NCOE       11    18.42527136     1.79      0.0496

                              T FOR H0:                  STD ERROR OF
PARAMETER      ESTIMATE       PARAMETER=0    PR > |T|    ESTIMATE

INTERCEPT      0.25501        0.31           0.7592      0.83191986
OAFQT          0.00094        1.26           0.2077      0.00074544
GTSCR          -0.00104897    -1.98          0.0479      0.00053034
PQSCR          0.00422902     9.26           0.0001      0.00045674
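The summary statistics in the top portion of Table XV are linked by standard identities, which can be used as a check on the tabulated values. The sketch below recomputes the mean squares, F, R2, and root MSE from the sums of squares and degrees of freedom in the table; the variable names are illustrative.

```python
import math

# Sums of squares and degrees of freedom from the top of Table XV.
ss_model, ss_error, ss_total = 2423.68, 35339.29, 37762.98
df_model, df_error = 55, 37798

ms_model = ss_model / df_model     # model mean square
ms_error = ss_error / df_error     # error mean square
f_value = ms_model / ms_error      # overall F statistic
r2 = ss_model / ss_total           # proportion of variance explained
root_mse = math.sqrt(ms_error)     # residual standard deviation

print(round(ms_model, 2), round(f_value, 2), round(r2, 4))
print(ms_error, root_mse)
```

The recomputed model mean square, F, and R2 round to the tabulated 44.07, 47.13, and 0.0642, and the error mean square and root MSE agree with the table to within the last printed digit.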


There  are  three  important  observations  from  Table  XV. 
First,  the  main  effects  variables,  with  the  exception  of 
GTSCR,  are  still  significant  in  their  ability  to  account  for 
variance  in  the  model. 

Secondly, no interaction terms are significant. The PR >
F values for these terms are much greater than .0001 and each
has a small F value. Thus, the effect of the interaction terms
will be assumed to be negligible.

Lastly, the bottom portion of the ANCOVA table lists
estimates of regression coefficients for the continuous
[...]

[...] were: OAFQT, PQSCR, HIYRED, NCOE, RACETH, and SEX. The
results were considered satisfactory, in that the remaining
variable set contains single measures of academic aptitude,
education, professional education, military performance
testing, as well as two categorical variables: SEX and
RACETH.

3. The Final Model: A Multiple Regression (ANCOVA)

a. Background

Regression analysis with a reduced set of
variables was the final step in successive data analyses.
The important result of this analysis was a set of
coefficient values which provided numerical statements
about the independent influence of each of the
explanatory variables. Of specific importance was the
independent influence of OAFQT and HIYRED in predicting an
individual promotion rate.

In  the  development  of  the  regression  model  this  section 
will: 

1. Review the pertinent results which led to the
regression model definition.

2. Compare the model using the three promotion rate
variables.

3. Select a single promotion variable for the model.

4. Interpret the resulting regression estimates and
conduct sensitivity analysis.

5. Check model assumptions and confirm the model using
an alternate data set and nonparametric procedures.

6. Test the model by comparing actual versus predicted
promotion rates for population subcategories.

[...] ANOVA and ANCOVA demonstrated that [...] differences
exist between [...] variables [...]. [...] plots of the level
means found in ANOVA [...] demonstrated an ascending linear
pattern when plotted against promotion variables.

ANOVA and ANCOVA models, using interactions, resulted in
the elimination of variables which did not demonstrate
sufficient linear additive effect to be included in the
model. Further, this analysis confirmed that there was no
significant interaction among the remaining variables.
Correlation analysis, combined with the in-depth univariate
analysis as to the nature and scoring procedures of the
individual variables, identified groups of variables. In
subsequent analysis, these groups were then restricted to
allow for only the strongest unique variable to be entered
into the model.

The final set of variables for entry into the model is
the following:

Promotion = f(OAFQT, PQSCR, HIYRED, NCOE, RACETH, SEX)

This model is a mixed scale and variable type model,
including both discrete and continuous variables. Two of the
input variables have nominal scale, RACETH and SEX. To allow
for their entry into the model, these values were transformed
[...]

In the above notation, $Y_i$ is one of the promotion
variables, $B_0$ is the linear intercept, and $B_1$ and $B_2$ are
coefficients for the continuous variables OAFQT and PQSCR.
$B_3$ and $B_4$ are coefficients for the discrete and ordinal
variables HIYRED and NCOE. $D_1$ through $D_5$ represent the dummy
variables for RACETH, and $D_6$ represents the dummy variable
for SEX.
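Dummy coding of a nominal variable can be sketched as follows. A variable with k levels is represented by k - 1 indicator columns, so the six-level RACETH variable (five degrees of freedom in Table XIV) needs five dummies and SEX needs one. The level codes and the choice of the last level as the reference category are assumptions made for illustration; the coding actually used in the study is not stated in this section.

```python
def dummy_code(value, levels):
    """Return k-1 indicators for a k-level nominal variable.

    The last entry of `levels` is treated as the reference category
    and is represented by all zeros.
    """
    return [1 if value == level else 0 for level in levels[:-1]]

RACETH_LEVELS = [1, 2, 3, 4, 5, 6]   # hypothetical category codes
SEX_LEVELS = ["F", "M"]              # hypothetical category codes

# One record with RACETH code 3 and SEX code "F".
row = dummy_code(3, RACETH_LEVELS) + dummy_code("F", SEX_LEVELS)
print(row)
```

Each dummy coefficient is then read as the shift in the promotion variable for that category relative to the reference category, with the other variables held fixed.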

The data set of 37,854 records was randomly split into
two separate data files for regression analysis. This
provided for a different data set to confirm analysis of
regression coefficients from the first set. Paragraph e.(1)
of this section compares resulting regression coefficients of
the model using the second data set.
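A random split of this kind can be sketched as below. The split proportion is not stated in the text, so the equal halves used here, the fixed seed, and the stand-in record identifiers are all assumptions for illustration.

```python
import random

def split_records(records, first_fraction=0.5, seed=1):
    """Shuffle a copy of the records and cut it into two disjoint files."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * first_fraction)
    return shuffled[:cut], shuffled[cut:]

record_ids = range(37854)   # stand-in identifiers, one per record
first_set, second_set = split_records(record_ids)
print(len(first_set), len(second_set))
```

Fitting the model on the first file and refitting it on the second gives two independent sets of coefficient estimates that can be compared directly.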
b.  Results 

Table XVI lists the regression results of the
basic model variables. When computing models for RATE and
PRATE, the effects variables CMF, and then CMF and PAYGD, were
reintroduced into the set of explanatory variables
respectively. This allowed for comparison of variable
coefficients and R2 value changes as the dependent variable
became more restricted. In Table XVI the top section shows
the ANOVA results of the model and reports the F and R2
statistics. Each column then gives the regression results of
each promotion rate model, including a Pr>T value as a measure
of the strength of rejection for a null hypothesis of zero
for the estimate value. Values of Pr>T less than .05 are
considered acceptable for consideration of that variable.


TABLE XVI  Regression Results

                     PRATE           RATE            PRA

Added Variables      CMF, PAYGD      CMF             None

ANOVA F              1317.4          360.3           218.5
Pr>F                 .0001           .0001           .0001
R2                   .3116           .0948           .0546

Intercept            0.022222        -1.03692        -1.28822
  (std error)        (.002558)       (.055368)       (.05600)
  Pr>T               .0001           .0001           .0001

OAFQT                .0001355        .0058817        .0042608
  (std error)        (.00000871)     (.0002444)      (.0002492)
  Pr>T               .0001           .0001           .0001

HIYRED               .0005341        .148352         .139484
  (std error)        (.000152)       (.004851)       (.0049298)
  Pr>T               .0001           .0001           .0001

PQSCR                .000089         .001608         .00327211
  (std error)        (.000014)       (.000449)       (.0004583)
  Pr>T               .0001           .0001           .0001

SEX                  -.0008582       .022904         .0564079
  (std error)        (.00050325)     (.01562)        (.0155310)
  Pr>T               .088*           .1427*          .0003

NCOE                 .00008839       .012688         .0073740
  (std error)        (.00000625)     (.0017808)      (.0017949)
  Pr>T               .1573*          .0001           .0001

D1 (RACETH)          .0026347        .053088         .01497054
  (std error)        (.0011286)      (.035653)       (.0363905)
  Pr>T               .0196           .1365*          .6808*

D2 (RACETH)          -.0037888       -.096320        -0.0898693
  (std error)        (.0011266)      (.035570)       (.0363089)
  Pr>T               .0008           .0068           .0013

D3 (RACETH)          -.0009404       -.0239592       -.0417668
  (std error)        (.001279)       (.040383)       (.04122033)
  Pr>T               .4623*          .5530*          .3109*

D4 (RACETH)          .00028892       .089059         .01007473
  (std error)        (.0032534)      (.102707)       (.1048355)
  Pr>T               .3745*          .3859*          .9234*

D5 (RACETH)          -.000224        -.021530        -.0138649
  (std error)        (.0018127)      (.0572261)      (.058409)
  Pr>T               .9016*          .7067*          .8124*

CMF                  -.000147        -.0053672       NA
  (std error)        (.0000052)      (.0001654)
  Pr>T               .0001           .0001

D7 (PAYGD)           .060127         NA              NA
  (std error)        (.0017904)
  Pr>T               .0001

D8 (PAYGD)           .017999         NA              NA
  (std error)        (.001774)
  Pr>T               .0001

Observations  from  the  regression  table  are  summarized  in 
the  following  paragraphs. 

The  input  variables  OAFQT,  HIYRED,  and  PQSCR  all 
maintained  a  positive  and  statistically  significant 
coefficient  value  across  all  three  dependent  variables. 

The  inclusion  of  PAYGD  with  the  PRATE  variable 
significantly  increased  the  R2  value  of  the  model. 
Conversely,  the  influence  of  OAFQT,  HIYRED,  PQSCR,  and  the 
other  explanatory  variables  was  severely  diminished. 

The RATE model is very similar to the PRA model, and has
generally larger estimate values and a higher R2. However,
the estimates for RACETH and SEX did not have significant T
values.

The PRA model, although having a lower R2 value and
generally smaller estimate values, had an acceptable T test
result for SEX. Additionally, the PRA model contained one
less nominal explanatory variable, CMF. The PRA model, then,
has fewer and more reliable nominal explanatory variables.
Since the objective of the study was to focus on academic and
educational measures as predictors of promotion, the PRA
model was chosen as the most effective predictive model.
Subsequent analysis of regression coefficient results was
conducted with the PRA model.


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c.  Interpretation 

Interpretation  of  the  regression  coefficients 
will  include  two  points.  First,  the  explanatory  variables 
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which  can  effect  the  greatest  change  in  the  dependent 
variable  will  be  identified.  Secondly,  an  example  will 
demonstrate  the  amount  of  change  in  a  given  explanatory 
variable required to achieve a five percent shift in the PRA
estimate.

The  amount  of  change  in  PRA  caused  by  a  change  of  one  unit 
of  an  explanatory  variable  can  be  read  directly  from  the 
regression  coefficients.  However,  the  total  amount  of  change 
that  an  explanatory  variable  can  cause  in  PRA  depends  on  the 
range  of  the  explanatory  variable.  Table  XVII  gives  an 
ordered  listing  of  the  explanatory  variables,  excluding 
categorical  variables,  from  most  to  least  total  influence  as 
measured  by  Net  Possible  Change.  The  net  possible  change  is 
simply  the  number  of  units  in  the  range  of  the  explanatory 
variable  multiplied  by  the  coefficient  estimate. 
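The net possible change calculation can be sketched as below, using the ranges and coefficient estimates listed in Table XVII. The sketch assumes the number of units is counted inclusively (maximum minus minimum plus one); the function and variable names are illustrative.

```python
# (low, high, coefficient estimate) as listed in Table XVII.
TABLE_XVII = {
    "HIYRED": (1, 12, 0.13948378),
    "OAFQT": (1, 99, 0.00426083),
    "PQSCR": (21, 100, 0.00327212),
    "NCOE": (0, 14, 0.00737408),
}

def net_possible_change(low, high, coefficient):
    """Coefficient times the number of integer units in [low, high]."""
    return (high - low + 1) * coefficient

changes = {name: round(net_possible_change(*row), 4)
           for name, row in TABLE_XVII.items()}
# Order the variables from most to least total influence.
ranking = sorted(changes, key=changes.get, reverse=True)
print(changes, ranking)
```

Under this inclusive count the sketch reproduces the tabulated values for HIYRED, OAFQT, and NCOE. For PQSCR it gives 0.2618, whereas the tabulated 0.2585 corresponds to 79 units (100 minus 21); the ordering of the variables is the same either way.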


TABLE XVII  Net Possible Change by Explanatory Variable

Variable    Range     Estimate      Net Possible Change

HIYRED      1-12      .13948378     1.6738
OAFQT       1-99      .00426083     0.4218
PQSCR       21-100    .00327212     0.2585
NCOE        0-14      .00737408     0.1106

[...] using the normal approximation [...] of the PRA
distribution. An upward shift of five percent would then
require the PRA value [...]. Using the standard normal table
[...] distribution, the PRA value corresponding to that
percentile was 0.1434. Checking the sensitivity of each
explanatory variable consisted of changing that explanatory
variable a sufficient number of units to result in a PRA
value of 0.1434, while holding all other explanatory
variables at the population average. Table XVIII tabulates
the increase of explanatory variable units necessary to
produce a 5 percent upward shift in PRA percentile.

Alternatively, if the amount required to reach the required
percentile was not possible within the range of the input
variable, the maximum amount of available change was listed.


TABLE XVIII  Sensitivity of PRA to Explanatory Variables

Variable    Average Value    Change to    PRA % Change

HIYRED       6.01             7.0         55.9
OAFQT       45.3             74.0         55.7
NCOE         3.06            14.0*        54.0
PQSCR       78.4             99.0*        53.4

* max value

Interpretation of the coefficient values clearly
demonstrates that HIYRED is the most important explanatory
variable. This observation is understandable since the
structure of the variable is discrete, and changes to
different values represent major distinctions in educational
background. The example of shifting from a value of six to a
value of seven represents the difference of having a high
school degree versus having gone to one year of college. In
percentages of HIYRED, that constitutes moving from a large
center group of high school qualified NCO's, to the upper
ninety percent of the HIYRED distribution.

OAFQT is the second most significant explanatory variable.
A shift of roughly one quarter of its range, i.e. 45 to 75,
can change PRA plus or minus five percent. The other
explanatory variables NCOE and PQSCR have considerably less
influence on the dependent variable.

d.  Checking  of  Assumptions 

To verify the requirements for the regression
model, residual analysis was performed using the Grafstat
program. Representative plots of the OAFQT residual are
shown in Figures 4.18 and 4.19.

Figure 4.18 Regression Residual Histogram    Figure 4.19 Regression Residual Scatter Plot
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The  histogram  of  residuals,  shown  in  Figure  4.18, 
demonstrates  that  the  residual  distribution  is  approximately 
normal.  Homoscedasticity  is  checked  in  Figure  4.19,  in  which 
residuals  have  been  plotted  against  the  OAFQT  variable. 
There do not appear to be any patterns in the plots of the
residuals, and the uniform pattern was considered sufficient
to justify the assumption of homoscedasticity. Lastly, since
each observation represents a different person, the
independence of each observation from one another is assumed
true.

e.  Confirmation  of  Regression  Findings 

(1)  Second  Data  Set.  Regression  analysis  was 
conducted  on  the  second  partition  of  the  data  set.  A 
comparison  of  those  results  with  the  first  data  set  is  shown 
in  Table  XIX. 


TABLE XIX  Comparison of Regression Data Sets

                                         PRA
Independent Variable        1st Set                  2nd Set
Estimator                Coeff      Std Err       Coeff      Std Err

OAFQT                    .004260    (.00025)      .004729    (.00032)
HIYRED                   .139483    (.00493)      .131559    (.00636)
PQSCR                    .003272    (.00046)      .003197    (.00060)

The  above  results  are  felt  to  be  sufficiently  comparable 
to  accept  the  original  model  coefficient  scores. 

(2) Nonparametric Regression. Since the model
contained an ordinal variable, HIYRED, a regression result
using nonparametric terms was included as a confirmatory
measure. Nonparametric regression produced the same linear
least squares approximation for the model estimates, so the
regression coefficient for HIYRED was still 0.1395. However,
for nonparametric regression the test for the acceptance of
the estimate value used the Spearman rank correlation
coefficient. The regression coefficient for HIYRED was
tested using this procedure.

First, for each value of PRA and HIYRED a value
U was found by computing U = PRA - (0.1395 * HIYRED). Then,
the Spearman rank correlation coefficient, rho, was computed,
based on the ranks of HIYRED and the ranks of U. It was
found to be 0.02482 with a Pr>|R| of 0.0001. In this test
the null hypothesis was that the value of the regression
coefficient was equal to 0.1395, the value found in
regression. [Ref. 13:pp. 265-271] To test the null
hypothesis, that the regression coefficient estimate is
correct, rho was compared against a rejection region computed
using the two tailed Spearman Quantile, with a normal
approximation. The rejection regions for this Spearman
Correlation parameter were values less than 0.0085 or greater
than 0.9915. Since the value of rho did not fall inside
either rejection region, the null hypothesis could not be
rejected, and a HIYRED regression coefficient of .1395 was
acceptable.
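The mechanics of this check can be sketched as follows: subtract the hypothesized slope times HIYRED from PRA, then compute the Spearman rank correlation between HIYRED and the remainder U. If the hypothesized slope is close to the true one, U should show little remaining monotone relationship with HIYRED. The helper names and the small data set are hypothetical, and the sketch covers only the computation of rho, not the rejection region.

```python
from statistics import mean

def average_ranks(values):
    """Ranks 1..n, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Pearson correlation of the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

def slope_check(pra, hiyred, slope):
    """Spearman rho between HIYRED and U = PRA - slope * HIYRED."""
    u = [p - slope * h for p, h in zip(pra, hiyred)]
    return spearman_rho(hiyred, u)

# Hypothetical records whose true slope is 0.5; not thesis data.
hiyred = [4, 6, 6, 7, 8, 10]
pra = [0.5 * h for h in hiyred]
rho_low = slope_check(pra, hiyred, 0.1395)   # slope too small: U still rises
rho_high = slope_check(pra, hiyred, 0.9)     # slope too large: U falls
print(rho_low, rho_high)
```

In this toy example a slope that is too small leaves rho near +1 and a slope that is too large drives it toward -1, so a rho close to zero is the outcome consistent with the hypothesized coefficient.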


f.  Testing  the  Model 

The model coefficients found by regression were
tested in two ways. First, a predicted promotion rate value
was computed for the extremes and average of the model. The
extreme values used the minimum or maximum values for the
input variables. The average promotion rate was computed
using sample averages for all input variables. The resulting
predictions were then compared against the actual
distribution percentiles.

Secondly, subsets of the sample population had average
promotion rates predicted using categorical values and sample
population averages. The resulting predictions were compared
against the actual sample values. Again, percentile values
for PRA were found by using a standard normal table
approximation.


TABLE XX  Comparison of Extreme and Average Predictions

                  Model Prediction          Data Sample Percentile
               PRA Value   Percentile      PRA Value   Percentile

Minimum         -1.0009      15.7%          -1.558         5%
                (.1000)      (3.5%)

Maximum          1.23029     89.1%           1.7866       95%
                (.4098)      (9.9%)

Average          0.01839     50.7%          -0.04146      50%
                (0.223)      (8.5%)


The  model  predictions  were  very  accurate  at  the  average 
level,  but  this  accuracy  diminished  at  the  extremes. 
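The conversion from a PRA value to a percentile can be sketched with the standard normal distribution function instead of a printed table. This is an illustration only; the function name normal_percentile is hypothetical, and the three PRA values are the model predictions listed in Table XX. Because the study used a table approximation, the computed values may differ from the tabulated percentiles by about a tenth of a percentage point.

```python
import math


def normal_percentile(z):
    """Percentile (0-100) of a standard normal score z."""
    return 50.0 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Model predictions of PRA taken from Table XX.
predictions = [
    ("minimum", -1.0009),
    ("maximum", 1.23029),
    ("average", 0.01839),
]

for label, z in predictions:
    print("%-8s PRA = %9.5f  percentile = %5.1f%%" % (label, z, normal_percentile(z)))
```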

The second test for the model was one where specific
population subcategories had their average PRA value
predicted. The subcategories represented were four
combinations of SEX and the black and white RACETH variables.
Additionally, predictions were made to check the average
promotion rate of all NCO's with a HIYRED value of 10, and
all NCO's with an OAFQT of 85. As in the previous table,
unless an input variable was used to define the subcategory,
its value was set to the overall population average. Table XXI
shows the results of the predictions.


TABLE XXI  Comparison of Predicted vs Actual PRA Averages

Subcategory     Predicted %   (Lower-Upper)   Sample %   Sample Size

Male/White         55.1        (45.7-64.2)      53.1       18,003
Male/Black         49.5        (40.3-58.9)      44.3       12,121
Female/Black       47.3        (37.7-56.1)      47.7        2,485
Female/White       52.9        (44.1-61.5)      59.5        1,842
HIYRED=10          71.7        (63.5-79.3)      75.7          969
OAFQT=85*          57.4        (44.7-69.4)      60.2        2,129

*The sample data point estimate was averaged over a
range of OAFQT 80 to 90.


Testing of the regression model indicates that it was
reasonably effective if used with input changes of the
nominal variables, such as SEX and RACETH. Changes in the
value of HIYRED produced reliable estimates, and demonstrated
the considerable contribution of this variable as a predictor
of PRA. The continuous variable OAFQT is difficult to test;
because it is continuous, the model estimate was taken over a
range of values. Predicted results are close to the sample
value, but the variance of the estimate still spans the
median. OAFQT does move the predicted values of PRA in the
right direction, but its effectiveness is severely hampered
by its variance and diminishing ability to provide an
accurate prediction value as PRA approaches either extreme.
Other prediction estimates were attempted using OAFQT and
their results demonstrated the same lack of predictive
ability away from the center percentiles.

g.  Summary of Regression Analysis

Regression analysis provided estimates of the
independent contribution of several key variables to
predicting a promotion rate. They include a measure of
intelligence aptitude, OAFQTP, a measure of academic ability,
HIYRED, two measures of military performance, PQSCR and NCOE,
and two nominal values, SEX and RACETH.

Testing of these estimates shows that the predictive
ability of the model is limited to those variables which have
very distinct abilities to subcategorize the sample
population. These variables are the SEX, RACETH, and HIYRED
variables. The continuous variables OAFQT and PQSCR cannot
be relied upon to independently yield estimates of PRA, but
can cause limited shifts of the PRA distribution within a
subcategory.

E.  SUMMARY  OF  FINDINGS 

Chapter IV was the principal analytical exercise in this
study. It progressed through ascending stages of analysis
and resulted in an inferential model with a restricted and
independent set of explanatory variables. These explanatory
variables did, in fact, rely on levels of intelligence tests
and academic background as values to predict promotion.

The model, however, demonstrated only limited utility as a
predictive equation. It could only match the sample data when
it was describing an average promotion rate among a large
population subcategory. This would occur only where the
change in the explanatory variable had a significant
partitioning effect on the population.

The  next  two  chapters  will  investigate  the  relationship  of 
intelligence  and  academic  ability  as  a  predictor  of  promotion 
rate  but  through  different  procedures. 


V.  ANALYSIS  OF  TOP  PERFORMERS 

A.  INTRODUCTION 

This  chapter  took  an  ad  hoc  approach  to  identify  any 
trends  which  distinguish  top  performers,  on  the  basis  of 
promotion  rate,  from  their  peers.  Top  performers  consist  of 
the  top  three  percent  of  the  population,  or  1,047 
individuals,  according  to  PRA  scores.  This  data  set  was 
referred  to  as  the  TOP  data  set,  while  the  remainder  were 
referred  to  as  the  SAMPLE  data  set. 

Analysis consists of three sections. The first section
is a comparative tabulation of means and variances. Results
shown in this section confirmed the majority of sample
characteristics predicted in Chapter IV, such as higher
EIMCAT and OAFQT scores. There were, however, discrepancies
with respect to TOP distribution values of RACETH, NCOE and
PAYGD. Those discrepancies are investigated in later
sections of this chapter. The second section reports the
results of formal hypothesis testing for differences in means
between each of the explanatory variables. The last section
investigates the discrepancies associated with RACETH, NCOE,
and PAYGD. Through a presentation of graphics demonstrating
internal shifts of those variable distributions, an effect
which appears to interrelate the three distributional
discrepancies is identified.


B.  COMPARISON  OF  MEANS  AND  VARIANCE 


The  tabulated  means  and  variances  of  the  study  variables 
for  the  top  three  percent  and  for  the  remainder  of  the  entire 
sample  are  presented  in  Table  XXII.  The  last  column  in  the 
table  shows  the  percentage  and  direction  that  the  TOP  data 
set  differed  from  the  SAMPLE. 
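The percentage entries in that last column can be approximated with a short calculation. The sketch below assumes the difference is expressed relative to the TOP mean; that convention is not stated in the text and is inferred only because it comes close to several of the tabulated entries (for example AFQTP, EIMCAT and HIYRED). The function name percent_difference is hypothetical, and the means are copied from Table XXII.

```python
def percent_difference(top_mean, sample_mean):
    """Difference of the TOP mean from the SAMPLE mean, as a percentage
    of the TOP mean (assumed convention, not confirmed by the source)."""
    return 100.0 * (top_mean - sample_mean) / top_mean


# (TOP mean, SAMPLE mean) pairs copied from Table XXII.
means = {
    "AFQTP": (64.69, 53.4),
    "EIMCAT": (6.11, 5.07),
    "HIYRED": (6.88, 6.01),
    "NCOE": (2.31, 3.06),
}

for name, (top, sample) in means.items():
    diff = percent_difference(top, sample)
    direction = ">" if diff > 0 else "<"
    print("%-7s Top %5.1f%% %s" % (name, abs(diff), direction))
```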


TABLE XXII  Top vs Sample Summary Data

Variable/Type        Top 3%               Sample            Comment
                 Mean     Std Dev     Mean     Std Dev

Promotion
  RATE            2.06      .392       0.00      1.00
  PRATE           .178      .037       .109      .036
  PRA             2.33      .350       0.00      1.00

Intelligence
  AFQTP          64.69     22.01      53.4      20.9      Top 17.5% >
  OAFQTP         61.60     23.24      45.3      24.7      Top 26.4% >
  EIMCAT          6.11      1.31       5.07      1.28     Top 17.0% >
  GTSCR         113.17     14.70     108.3      14.2      Top 4.1%  >
  HIYRED          6.88      1.59       6.01      1.07     Top 12.6% >
  EDLVL           7.12      1.55       6.32       .97     Top 11.2% >
  PQSCR          80.57     11.31      78.4       1.6      Top 2.6%  >
  NCOE            2.31      2.50       3.06      2.81     Top 33%   <

Effects
  SEX             1.18      .390       1.12      .328     Top 5%    >
  CMF            62.09     27.146     51.9      31.3      Top 16%   >
  RACETH          1.58      .975       1.65      .942     Top 4%    <
  PAYGD           5.19      .405       5.27      .464     Top 3%    <


Observations derived from the data in Table XXII can be
summarized as follows:

The four aptitude test variables, GTSCR, AFQTP, OAFQTP and
EIMCAT, all demonstrate a strong positive difference between
the TOP and SAMPLE scores. The AFQT related scores are about
twenty percent greater, with GTSCR greater by four percent.

The variables EDLVL and HIYRED were both positive, with
HIYRED slightly larger at twelve percent. PQSCR increased
slightly.

The effects variables SEX and CMF both increased, with
CMF demonstrating a significant increase. The change in CMF
was an unexpected result of subsetting to the top three
percent. The PRA variable was designed to be independent of
CMF, and it should not have been affected as significantly as
it was.

The only variables which decreased in proportion between
SAMPLE and TOP were NCOE, RACETH, and PAYGD. Of the three,
the change in NCOE was the largest. The change in NCOE was
also an unexpected result. Regression analysis indicated that
NCOE had a positive influence on PRA. To have NCOE decrease
with top performers is the reverse result. Paragraph D of this
section will attempt to explain the reason for this anomaly.

C.  SIGNIFICANCE  TESTING 

Significance testing for means of the explanatory
variables between the TOP and SAMPLE data sets was included
as a formal statistical confirmation of differences between
the two data sets. Testing using nonparametric methods was
utilized since the study variables were either discrete or,
if continuous, did not meet the Kolmogorov-Smirnov one-sample
test for a normal distribution. The type of nonparametric
test used depended on the scale of the variable and whether
it was continuous or discrete.

TABLE XXIII  Top vs Sample Hypothesis Results

Variable     Test Used                       Results

Intelligence
  GTSCR      Kruskal-Wallis Test (1)         Chisq = 671    Strongly reject H0:
  AFQTP      Kruskal-Wallis Test             Chisq = 1165   Strongly reject H0:
  OAFQTP     Kruskal-Wallis Test             Chisq = 1418   Strongly reject H0:
  EIMCAT     2 x C Contingency Table (2)     Chisq = 503    Strongly reject H0:
  HIYRED     2 x C Contingency Table         Chisq = 931    Strongly reject H0:
  EDLVL      2 x C Contingency Table         Chisq = 700    Strongly reject H0:
  PQSCR      Kruskal-Wallis Test
  NCOE       2 x C Contingency Table         Chisq = 26.1   Reject H0:

Effects
  SEX        2 x C Contingency Table         Chisq =
  CMF        2 x C Contingency Table         Chisq =        Strongly reject H0:
  RACETH     2 x C Contingency Table         Chisq =        Reject H0:
  PAYGD      2 x C Contingency Table         Chisq =        Strongly reject H0:

(1) For this nonparametric test the null hypothesis is that
the populations are identical. The alternate hypothesis is
that one of the populations yields larger observations. With
two populations this is equivalent to a Mann-Whitney test.
At a level α of .95 the critical Chi-square value for
rejection is Chisq > 3.84.

(2) For this nonparametric test the null hypothesis is that
the two populations have the same distribution as measured by
the probability of falling into one of the discrete variable
classifications. The alternate hypothesis is that the
distributions are different. The contingency table is set
for the two rows to be the classification of PRA > 1.93 and
PRA < 1.93; the C represents the number of discrete levels in
the variable being tested. The Chi-square test statistic is
also used for this test with a rejection of H0: when Chisq is
larger than 3.84 at a .95 level α.

Hypothesis testing confirms the observations made on
simple means and variances of the study variables. The
strength of the difference can be interpreted by the
magnitude of the Chi-square statistic.
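The two test statistics named in Table XXIII can be sketched as follows. This is a minimal illustration with invented data, not the SAS output of the study; the function names kruskal_wallis and chi_square_2xc are hypothetical. The value 3.84 is the .05 upper-tail Chi-square critical value for one degree of freedom, which is the case for two groups in the Kruskal-Wallis test; a 2 x C table has C - 1 degrees of freedom.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks


def kruskal_wallis(groups):
    """Kruskal-Wallis H statistic with the usual correction for ties."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3.0 * (n + 1)
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    ties = sum(t ** 3 - t for t in counts.values())
    correction = 1.0 - ties / float(n ** 3 - n)
    return h / correction if correction > 0 else 0.0


def chi_square_2xc(table):
    """Pearson Chi-square for a contingency table given as a list of rows.
    Assumes no row or column total is zero."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = float(sum(row_totals))
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand
            stat += (observed - expected) ** 2 / expected
    return stat


# Invented illustration data: a continuous score for TOP and SAMPLE members,
# and counts of a three-level discrete variable in each data set.
top_scores = [118, 121, 109, 125, 130]
sample_scores = [101, 99, 112, 95, 108, 104]
counts = [[12, 30, 58],    # TOP row
          [40, 35, 25]]    # SAMPLE row

print("Kruskal-Wallis H:", round(kruskal_wallis([top_scores, sample_scores]), 3))
print("2 x C Chi-square:", round(chi_square_2xc(counts), 3))
```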

D.  ANALYSIS  OF  DISTRIBUTIONS 

This  section  further  investigates  the  shifts  in 
distributions  for  those  variables  which  conflicted  with  the 
relationships  derived  in  regression  and  correlation  analysis. 
Those  variables  were  CMF,  NCOE  and  PAYGD.  Again,  the 
conflicts  which  arose  were  two-fold. 

First, neither CMF nor PAYGD should have been affected by
subsetting of the PRA variable. The PRA scores are normalized
differences from the average score for every paygrade and CMF
combination. Assuming a uniform application of promotion
policy, then, no one CMF or paygrade should have dominated as
a result of subsetting to the top three percent. Secondly,
NCOE should have increased slightly rather than decreased
significantly by subsetting to the top three percent.
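The kind of normalization referred to above can be sketched as follows. The exact formula is not given in this section, so the sketch assumes a z-score within each paygrade and CMF cell (difference from the cell mean divided by the cell standard deviation). The function name normalize_within_cells, the CMF codes and the rates are all hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def normalize_within_cells(records):
    """records: list of (paygrade, cmf, rate) tuples.
    Returns z-scores in the same order, computed within each
    (paygrade, cmf) cell. Cells with no spread get a score of 0."""
    cells = defaultdict(list)
    for paygrade, cmf, rate in records:
        cells[(paygrade, cmf)].append(rate)
    stats = {key: (mean(vals), pstdev(vals)) for key, vals in cells.items()}
    scores = []
    for paygrade, cmf, rate in records:
        mu, sigma = stats[(paygrade, cmf)]
        scores.append((rate - mu) / sigma if sigma > 0 else 0.0)
    return scores


# Invented promotion rates for two paygrade/CMF cells.
records = [
    ("E-5", "11", 0.10),
    ("E-5", "11", 0.12),
    ("E-5", "11", 0.14),
    ("E-6", "63", 0.08),
    ("E-6", "63", 0.10),
]

for record, score in zip(records, normalize_within_cells(records)):
    print(record, round(score, 3))
```

Under such a scheme every cell has mean zero, so if promotion policy were applied uniformly, no single cell would be expected to dominate the upper tail.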

The three inconsistencies appear to be linked in their
distributional change. Observation of Figures 5.1, 5.2, and
5.3 demonstrates this.


the  sample  density  away  from  the  NCOE  7  to  the  NCOE  0  level. 
This  was  consistent  with  the  observations  in  Figure  5.1, 
since  only  combat  arms  NCO's  qualify  for  level  7,  the  Combat 
Arms  Primary  Leadership  course. 


[Cluster bar chart of PAYGD for the TOP and SAMPLE data sets]

Figure 5.3  Top vs Sample PAYGD


The last figure, Figure 5.3, shows a displacement of
percentage from the E-6 to the E-5 paygrade as a result of
extracting only the top three percent by measure of promotion
rate.

To offer an explanation of the underlying reason for
these discrepancies is difficult. Some measure of this
discrepancy may well be explained in that the removal of
effects by normalizing the PRA scores was not entirely
adequate. The observed discrepancy may be simple
mathematical error. However, it can be noted that their
interrelationships do act consistently. Specifically, the
reduction in paygrade and combat MOS's both combine to
significantly reduce the NCOE level. As such, it is more
likely that the change in NCOE occurred coincident with the
changes in the two variables PAYGD and CMF. The effect being
demonstrated was one where junior combat service support
NCO's were dominating promotion achievement.

E.  SUMMARY  OF  FINDINGS 

Comparing the changes in averages for the top performers
to the regression coefficients found in Chapter IV shows
very substantial agreement. Specifically, OAFQT was the most
significant intelligence test variable, while HIYRED was the
most significant academic variable. Although the percent
change in OAFQT is greater than HIYRED, it still has
considerably more variance than HIYRED. Thus, the predictive
ability of HIYRED in regression should be more pronounced
than that of OAFQTP. The less significant variables of
PQSCR, SEX, and RACETH each shifted a small, significant
amount in the appropriate direction.

The  only  discrepancy  between  the  two  procedures  is  the 
change  in  the  variable  NCOE.  This  change  is  felt  to  have 
been  induced  by  changes  in  the  CMF  and  PAYGD  distributions. 
The  effect  is  one  where  junior  combat  service  support  NCO's 
replace  NCO's  from  the  combat  MOS's. 

An important observation from analysis of the top three
percent was that the increase in the value of any explanatory
variable was not extreme. In fact, the largest increase was
only twenty-five percent. As an inference, it appears that
NCO's who do a little better in a combination of areas,
rather than much better in a single area, are more likely
recipients of faster promotion rates.


VI.  PRINCIPAL COMPONENTS AND FACTOR ANALYSIS


A.  INTRODUCTION 


In  this  chapter  more  advanced  statistical  procedures  are 
implemented  to  better  summarize  the  independent  variables, 
and  improve  or  at  least  simplify  the  cause-effect  model. 
Principal  components  and  factor  analysis  are  two  closely 
related  procedures  which  are  normally  used  in  investigating 
the  mutual  relationships  and  communalities  of  a  large  number 
of  variables.  By  identifying  redundant  variables,  and  by 
constructing  composite  variables  of  the  originals,  it  is 
possible  to  reduce  the  number  of  independent  explanatory 
variables  to  only  those  which  are  significant  and  unique. 


B.  THEORY 

Principal components and factor analysis each use matrix
algebra to operate on a P by P matrix of correlation or
covariance coefficients and produce a system of eigenvectors
of the form:

$$Y_{(j)} = a_{1j} X_1 + a_{2j} X_2 + \cdots + a_{pj} X_p + E$$

In the notation, $Y_{(j)}$ represents the resultant composite
variable, which is the linear combination of the loading
coefficients $a_{nj}$. These loading coefficients multiply
each of the original variables $X_n$, n = 1..p. E represents
the amount of residual error not accounted for by the linear
model. [Ref. 5:p. 328] The resulting eigenvectors represent a
set of orthogonal components jointly perpendicular in the
space of the original variables. [Ref. 15:p. 424] These
components are jointly uncorrelated and individually account
for levels of variance, where the first principal component
accounts for the largest proportion, and the last principal
component accounts for the smallest. A resulting component
may be representative of some aggregate characteristic of the
original input variables. For example, a resulting
eigenvector which has strong factor loadings for original
variables of physical strength and endurance could be called
a factor of stamina as an aggregate measure. Principal
components and factor analysis differ in that principal
components assume and require that a number of components
equal to the number of initial variables is needed to account
for the total variance. In contrast, the factor method
assumes that there exists a set of composites in a dimension
smaller than the dimension of the original number of
variables which will suffice. [Ref. 5:p. 622]

An additional aspect of factor analysis is that it allows
for rotation of the solution with the intent of developing
more unique and well-defined components. For example, if
there are five variables in a factor which have intermediate
loading factors in the range .2 to .4, a rotation of common
factors by applying nonsingular linear transformations may
result in a pattern matrix in which the loadings are either
zero or close to one. The end result is easier to interpret
than the factor with numerous mixed elements. Graphical
measures are useful with the rotation procedure and allow the
analyst to see the relative uniqueness of the input
variables.
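The extraction of the first principal component from a correlation matrix can be sketched with power iteration. This is a minimal illustration, not the SAS procedure used in this study: the 3 by 3 correlation matrix is invented, the function name first_principal_component is hypothetical, and rotation is not shown. The loadings are taken as the eigenvector scaled by the square root of its eigenvalue, so their squares sum to the eigenvalue.

```python
import math


def first_principal_component(corr, iterations=500, tol=1e-12):
    """Dominant eigenvalue and unit eigenvector of a symmetric
    correlation matrix, found by power iteration."""
    p = len(corr)
    # A non-uniform start avoids beginning exactly on a minor eigenvector.
    vec = [1.0 + 0.1 * i for i in range(p)]
    norm = math.sqrt(sum(v * v for v in vec))
    vec = [v / norm for v in vec]
    eigenvalue = 0.0
    for _ in range(iterations):
        nxt = [sum(corr[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(v * v for v in nxt))
        nxt = [v / norm for v in nxt]
        # Rayleigh quotient of the updated vector.
        product = [sum(corr[i][j] * nxt[j] for j in range(p)) for i in range(p)]
        new_eigenvalue = sum(a * b for a, b in zip(nxt, product))
        converged = abs(new_eigenvalue - eigenvalue) < tol
        vec, eigenvalue = nxt, new_eigenvalue
        if converged:
            break
    return eigenvalue, vec


# Invented correlation matrix for three variables.
corr = [
    [1.0, 0.8, 0.3],
    [0.8, 1.0, 0.2],
    [0.3, 0.2, 1.0],
]

eigenvalue, vector = first_principal_component(corr)
loadings = [v * math.sqrt(eigenvalue) for v in vector]
print("eigenvalue:", round(eigenvalue, 4))
print("proportion of variance:", round(eigenvalue / len(corr), 4))
print("loadings:", [round(a, 4) for a in loadings])
```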

C.  RESULTS

The SAS procedure for performing factor analysis was used
with the method of factor determination being the principal
component method. As such, basic principal component
analysis was conducted, but limits were applied on the number
of factors retained so that only the most significant
composite factors would be kept. The first set of input
variables included all of the twelve study variables. Table
XXIV shows the resulting factor solution. Appended below
each component is an interpretation explaining what the
aggregate factors represent. The original input variables
which contributed most to the factor have been underlined.
Following Table XXIV is a factor plot, Figure 6.1, where
each of the variables is coded by a letter. By observing the
plot, any lack of uniqueness for a group of variables can be
noted where the coded letters are close to one another.

TABLE XXIV  Principal Components Tabular Results

Input Matrix of correlation coefficients
PRIOR COMMUNALITY ESTIMATES: ONE

                  1       2       3       4       5       6       7
EIGENVALUE     4.0052  1.7334  1.4979  1.0634  0.8496  0.8028  0.7542
DIFFERENCE     2.2717  0.2355  0.4344  0.2138  0.0468  0.0486  0.2149
PROPORTION     0.3338  0.1445  0.1248  0.0886  0.0708  0.0669  0.0628
CUMULATIVE     0.3338  0.4782  0.6031  0.6910  0.7625  0.8294  0.8922

                  8       9      10      11      12
EIGENVALUE     0.5392  0.3500  0.2809  0.1196  0.0034
DIFFERENCE     0.1892  0.0690  0.1613  0.1161
PROPORTION     0.0449  0.0292  0.0234  0.0100  0.0003
CUMULATIVE     0.9372  0.9663  0.9897  0.9997  1.0000

7 FACTORS WILL BE RETAINED BY THE NFACTOR CRITERION

FACTOR PATTERN

            FACT1   FACT2   FACT3   FACT4   FACT5   FACT6   FACT7

EDLVL       .4302   .5861   .5024  -.2544  -.0624  -.0693   -.029
AFQTP       .9515  -.1133  -.1195   .0637  -.0075   .1548   -.024
EIMCAT      .9060  -.1220  -.1652  -.0598  -.0096   .1478    .011
NCOE       -.0085  -.4507   .6668   .2527  -.0398   .0084   -.134
HIYRED      .3834   .6410   .4176  -.3281  -.0637  -.0830   -.124
SEX         .1735   .4212  -.1113   .6516   .1857  -.0736   -.550
OAFQT       .9518  -.1046  -.1156   .0590  -.0092   .1535   -.023
GTSCR       .8238  -.1128   .0090   .0331  -.0464   .1350    .132
PQSCR       .4001  -.2413   .1205  -.1150  -.7312  -.4527    .115
CMF         .1677   .5200  -.1449   .4985  -.1171  -.2587    .561
PAYGD       .1216  -.3467   .6770   .3367  -.1816  -.0495    .151
RACETH     -.3590   .3130   .2547   .1229   .4708   .6507    .216

            Intell  Acad    Career  Sex     PQSCR   RACE    CMF
            Tests           Status

FINAL COMMUNALITY ESTIMATES: TOTAL = 10.706622


PLOT OF FACTOR PATTERN FOR FACTOR1 AND FACTOR3

[Factor pattern plot of FACTOR1 against FACTOR3, with each
variable plotted as a coded letter]

EDLVL=A  AFQTP=B  EIMCAT=C  NCOE=D  HIYRED=E  SEX=F
OAFQT=G  GTSCR=H  PQSCR=I  CMF=J  PAYGD=K  RACETH=L

Figure 6.1

The results appear to be quite reasonable, where the most
significant factor is a composite of all the mental aptitude
measures: OAFQTP, AFQTP, GTSCR, and EIMCAT. The second
factor consists primarily of academic performance measures
EDLVL and HIYRED. The third factor is composed of NCOE and
PAYGD and reflects two closely related measures dominated by
paygrade. The fourth factor is predominantly a measure of
SEX and two other nominal variables, CMF and PAYGD. The
fifth, sixth and seventh factors all appear to be dominated
by single variables, PQSCR, RACE, and CMF respectively.

In short, each of the original twelve variables is in
some measure represented in the five factors, the first five
factors accounting for over seventy-five percent of the
variance. By observing the entry for PROPORTION one can see
that the subsequent seven factors each contributed between
.0668 and .0028 of the variance and as such are not major
contributors.
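The PROPORTION and CUMULATIVE rows of Table XXIV follow directly from the eigenvalues. A short sketch, assuming only that the input was a correlation matrix (so the total variance equals the number of variables, twelve); the function name explained_variance is hypothetical and the eigenvalues are copied from Table XXIV.

```python
def explained_variance(eigenvalues):
    """Proportion and cumulative proportion of variance for each
    component of a correlation-matrix solution."""
    total = float(len(eigenvalues))  # trace of a correlation matrix
    proportions = [ev / total for ev in eigenvalues]
    cumulative = []
    running = 0.0
    for share in proportions:
        running += share
        cumulative.append(running)
    return proportions, cumulative


# Eigenvalues copied from Table XXIV.
eigenvalues = [4.0052, 1.7334, 1.4979, 1.0634, 0.8496, 0.8028,
               0.7542, 0.5392, 0.3500, 0.2809, 0.1196, 0.0034]

proportions, cumulative = explained_variance(eigenvalues)
for number, (share, total) in enumerate(zip(proportions, cumulative), start=1):
    print("component %2d  proportion %.4f  cumulative %.4f" % (number, share, total))
```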

Using the results of the first solution, a second analysis
was conducted with a reduced number of input variables. In
each of the initial solution factors the single variable
having the largest loading factor was selected and the other
related variables were eliminated. Table XXV shows the
results of that solution, and Figure 6.2 shows the Factor
Plot.


TABLE XXV  Reduced Principal Components Tabular Results

PRIOR COMMUNALITY ESTIMATES: ONE
Input Matrix of correlation coefficients

                  1       2       3       4       5       6       7
EIGENVALUE     2.1666  1.2063  1.0019  0.8703  0.8049  0.7081  0.2416
DIFFERENCE     0.9602  0.2044  0.1315  0.0654  0.0967  0.4665
PROPORTION     0.3095  0.1723  0.1431  0.1243  0.1150  0.1012  0.0345
CUMULATIVE     0.3095  0.4819  0.6250  0.7493  0.8643  0.9655  1.0000

7 FACTORS WILL BE RETAINED BY THE NFACTOR CRITERION

FACTOR PATTERN

            FACT1   FACT2   FACT3   FACT4   FACT5   FACT6   FACT7

NCOE        .0221  -.5422   .6941   .2656  -.3801  -.1071    .018
HIYRED      .3659   .5302   .3135  -.5162  -.2443  -.4001   -.004
SEX         .1803   .6532   .1514   .6993   .0899  -.1346   -.051
OAFQT       .8945   .0404  -.0412   .0502  -.0668   .2462   -.328
GTSCR       .8592  -.0374   .0154  -.0492  -.1259   .3664   -.328
PQSCR       .5069  -.3707   .2537  -.0613   .7141  -.2648   -.022
RACETH     -.4521   .3275   .5799  -.1589   .2487   .5031    .037

            Intell  Acad    NCOE    SEX     PQSCR   Race
            Tests

FINAL COMMUNALITY ESTIMATES: TOTAL = 7.000000

  NCOE    HIYRED   SEX     NOAFQT  GTSCR   PQSCR   RACETH
  1.0000  1.0000   1.0000  1.0000  1.0000  1.0000  1.0000

PLOT OF FACTOR PATTERN FOR FACTOR1

[Factor pattern plot, with each variable plotted as a coded
letter; surviving legend entries: PQSCR=F  RACETH=G]

Figure 6.2  Factor Plot

Restricting the input to the strongest unique variables
results in an almost complete separation into single factors.
The only exception is the grouping of GTSCR and OAFQT (E and
D). This is not surprising considering the composition of
both scores from the same set of tests in the ASVAB. Thus,
the decision to eliminate GTSCR from earlier regression
models makes sense from the Factor Analysis perspective as
well.

E.  SUMMARY OF FINDINGS

The  application  of  principal  components  and  factor 
analysis  confirmed  many  of  the  patterns  of  dependency  and 
redundancy  with  the  study  variables.  It  confirmed  the 
choices  for  unique  variables  in  the  regression  as  developed 
in  Chapter  IV,  and  gave  a  good  second  opinion  for  deciding 
which  variables  could  be  set  aside  with  little  effect  on  the 


VII.  CONCLUSION


A.  OVERALL  FINDINGS 

There is strong statistical evidence to support the
proposition that success in the Army, as measured by
promotion rate, is related to the individual's intelligence
test scores and previous academic background. The
explanatory variables of the 1980 normed AFQT score and the
individual's highest year of education at time of entry are
the most important indicators for a future promotion rate.
The highest year of education at time of entry is the more
important measure, but changes in its discrete scale
represent very substantial changes in academic background.
OAFQT is not nearly as important as HIYRED and can
independently affect the predicted promotion rate only up to
ten percent.

While in service, how well the individual scores on his
Performance Qualification Test and his attendance at NCO
schooling will be indicative of a faster promotion rate.

The statistical evidence for these observations can be
argued by showing the existence of significantly increasing
promotion rate averages across ascending levels of
explanatory measures in ANOVA and ANCOVA analysis. This
argument can be supplemented, and those differences seen more
concretely, by a simpler comparison of top performers versus
the sample averages.


Considerable  variance  of  promotion  rate  exists  across  any 
of  the  levels  of  the  discrete  explanatory  variables,  and 
within  any  of  the  categorical  variables.  There  is  a  dilemma 
in  designing  an  effective  dependent  variable.  While 
controlling  categorical  variables  such  as  CMF  and  Paygrade, 
the  effects  of  the  other  variables  become  more  apparent  and 
significant.  However,  the  ability  of  the  model  to  explain 
variance  is  significantly  diminished. 

Selecting a set of the most important and unique
explanatory variables was achieved via two methods. A
successive, increasing dimension procedure distilled a set of
unique explanatory variables. This method relied upon
developing detailed familiarity with each variable. In the
process hypothesis testing was used to eliminate
insignificant contributors and identify the most important
variable from a group of related variables. This restricted
set of explanatory variables was confirmed with the use of
principal components, a method which uses a mathematical
approach to identify orthogonal and unique variables.

When using inferential procedures the resulting model
met regression assumptions, both parametrically and
nonparametrically. Further, the model estimates are
reproducible with an alternate data set.

Although the model is technically acceptable, it is only
accurate in predicting promotion values for population
subcategories. The low R² value and high mean square error
terms found during regression were manifested in model
testing. When making predictions based on incremental
changes in AFQT the sample data values were close, but upper
and lower bounds were so large that resulting predictions
were not useful.

The poor performance of the predictive model can be
attributed to two possible reasons. First, there may exist
some unspecified predictor variable which could be used to
better account for variance. Secondly, there may exist
significant inexplicable chance in the occurrence of a
promotion rate for any given individual.


In the case of the first reason, it should be observed
that the number of available entries held on a given
individual at either DMDC or MILPERCEN is limited. Of the
one hundred and forty data fields, this study considered all
entries which were felt to have potential merit as an
explanatory variable. This included several versions
expressing the same fundamental quality. Of the twelve
variables considered, the final number of significant
variables was reduced to only six. Overall, there are few
significant and unique measures available to use as
predictors. To discover additional explanatory variables
would require establishment of new personnel data elements in
those data bases. Potential candidates include evaluation
report averages or, possibly, the results of a personality
composite test. Alternatively, the quality of information on
academic performance could be increased, such as by the
inclusion of grade averages from high school attendance
periods. The utility of this additional data would then have
to be evaluated in a manner similar to this thesis.

The second reason given for error is the more probable
explanation, for the subject matter of this study is people,
and not a more deterministic physical phenomenon. The
resolution of a cause-and-effect relationship is more subtle and
more difficult to verify. Although this condition does not
have a mathematical remedy, the judgement of whether or not
even a small, highly variable measure of trend is sufficient
still lies with the analyst and his ability to present that
judgement to decision makers.

B.  POLICY  RECOMMENDATIONS 

The first question that must be answered in this section
is whether or not having a predictive model is necessary to
make policy decisions regarding promotion or accession. The
answer offered in this document is that it is not. There is
sufficiently reliable information resulting from hypothesis
testing and subpopulation analysis with which to make cogent
observations and decisions.

From the results of this investigation, accession policy
makers should closely manage the two attributes of OAFQT and
HIYRED. This recommendation is more a confirmation than
a proposal. The 1984 Defense Authorization Act already
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promotion  data  approach,  but  rather,  a  duration  model 
approach with a set group of individual soldiers over
time. [Ref. 11: pp. 7-9] His paper reports that this disparity
is a result of attrition. Specifically, the shifting of
subcategory promotion averages is a result of different
retention patterns among race and ethnic groups, and not due
to a racially sensitive promotion system.

A  study  to  determine  the  magnitude  and  underlying  reasons 
for  the  different  retention  patterns,  and  to  test  this 
hypothesis,  would  have  considerable  merit. 
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CAREER  MANAGEMENT  FIELDS  AND  FREQUENCIES 
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
FIELD           CMF   FREQUENCY   PERCENT   CUM FREQUENCY   CUM PERCENT
Infantry         11        4320      11.4            4320          11.4
Cbt Engineer     12        1030       2.7            5350          14.1
Artillery        13        2780       7.3            8130          21.5
Air Defense      16         851       2.2            8981          23.7
Special Ops      18         244       0.6            9225          24.4
Armor            19        2434       6.4           11659          30.8
Hawk Missile     23         187       0.5           11846          31.3
Nike Missile     27         352       0.9           12198          32.2
Tac Radar        28          40       0.1           12238          32.3

Tac  Radar 

29 

625 

1.7 
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34 . 0 

Communication 

31 
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8 . 6 
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42 . 6 
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33 

30 

0.1 
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51 
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54 
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67 
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74 
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79 
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0.3 

29829 

78 . 8 
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81 

65 

0.2 
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AV  Spec 

84 
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0.4 
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79 . 4 

Medical 

91 
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6 . 6 
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86 . 0 

Lab  Spec 

92 
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1 . 2 
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87.2 
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93 
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AFQT  TRANSFORMATION  EQUIVALENT  SCORES 

Armed  Forces  Qualification  Test  (AFQT) 
Equivalent  Percentile  Scores  for  1944 
Mobilization  Population  and  1980  Youth  Population 
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